4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
In this paper several methods are proposed for reducing the size of a trigram language model (LM), which is often the largest data structure in a continuous speech recognizer, without affecting its performance. The common factor shared by the different approaches is to select only a subset of the available trigrams, trying to identify those trigrams that contribute most to the performance of the full trigram LM. The proposed selection criteria apply to trigram contexts of length one or two. These criteria rely on information-theoretic concepts, on the back-off probabilities estimated by the LM, or on a measure of the phonetic/linguistic uncertainty of a given context. The performance of the reduced trigram LMs is compared both in terms of perplexity and recognition accuracy. Results show that all the considered methods perform better than the naive frequency shifting method. In fact, a 50% size reduction is obtained on a shift-1 trigram LM, at the cost of a 5% increase in word error rate. Moreover, the reduced LMs improve the word error rate of a bigram LM of the same size by around 15%.
Bibliographic reference. Brugnara, Fabio / Federico, Marcello (1996): "Techniques for approximating a trigram language model", In ICSLP-1996, 2075-2078.
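The core idea of the paper, keeping only the trigrams that contribute most relative to what the back-off (bigram) estimate already provides, can be sketched as follows. This is an illustrative sketch of a generic likelihood-based selection criterion, not the authors' exact formulas; the scoring function, count structures, and `keep_ratio` parameter are assumptions for the example.

```python
import math

def prune_trigrams(trigram_counts, bigram_counts, unigram_counts, keep_ratio=0.5):
    """Keep the fraction of trigrams whose removal would hurt the model most.

    Score each trigram (u, v, w) by the count-weighted log-ratio between the
    trigram estimate p(w | u, v) and the bigram fallback p(w | v): trigrams
    that are well approximated by backing off score low and are dropped.
    This is one plausible information-theoretic criterion, used here only
    to illustrate the selection idea described in the abstract.
    """
    total = sum(unigram_counts.values())
    scored = []
    for (u, v, w), c in trigram_counts.items():
        p_tri = c / bigram_counts[(u, v)]
        # bigram fallback estimate, with a small floor to avoid log(0)
        if unigram_counts.get(v):
            p_bi = bigram_counts.get((v, w), 0) / unigram_counts[v]
        else:
            p_bi = 1 / total
        p_bi = max(p_bi, 1e-12)
        # contribution ~ likelihood lost by backing off for this trigram
        score = c * math.log(p_tri / p_bi)
        scored.append((score, (u, v, w)))
    scored.sort(reverse=True)
    n_keep = int(len(scored) * keep_ratio)
    return {tri for _, tri in scored[:n_keep]}
```

With `keep_ratio=0.5` this realizes the 50% size reduction discussed in the abstract; trigrams whose conditional probability is close to the back-off estimate are the first to be discarded, since the recognizer loses little by falling back to the bigram model for them.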