EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


On Integrating the Lexicon with the Language Model

Diamantino Caseiro, Isabel Trancoso

INESC-ID/IST, Portugal

The goal of this work was to develop an algorithm for integrating the lexicon with the language model that is efficient in its memory requirements, even for large trigram models. Two specialized versions of the transducer composition algorithm were implemented. The first is a composition algorithm that uses the precomputed set of output labels reachable from a given epsilon edge of the lexicon; the second adds an "on the fly" implementation of the pushing of weights and output labels. The proposed algorithms gave very significant memory savings compared with the general determinization algorithm for weighted transducers.
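
As an illustration of the precomputation mentioned for the first algorithm, the following minimal sketch computes, for every epsilon-output edge of a toy lexicon, the set of output labels that can be reached through it. It is not the authors' implementation. The edge representation, the names `EPS`, `LEXICON` and `reachable_outputs`, the three-word lexicon, and the assumption that the lexicon is acyclic with word labels on the last phone edge are all hypothetical choices made for this example.

```python
from collections import defaultdict

EPS = None  # assumed marker for an epsilon (empty) output label

# Hypothetical toy lexicon as a phone prefix tree.
# Each edge is (source state, input phone, output word or EPS, destination state).
LEXICON = [
    (0, "k", EPS, 1),
    (1, "ae", EPS, 2),
    (2, "t", "cat", 3),
    (2, "b", "cab", 3),
    (0, "d", EPS, 4),
    (4, "ao", EPS, 5),
    (5, "g", "dog", 3),
]


def reachable_outputs(edges):
    """Map each epsilon-output edge to the output labels reachable through it.

    Assumes the transducer is acyclic, so the recursion terminates.
    """
    out_edges = defaultdict(list)
    for edge in edges:
        out_edges[edge[0]].append(edge)

    memo = {}

    def from_state(state):
        # Collect the first non-epsilon output label on every path leaving `state`.
        if state not in memo:
            labels = set()
            for _, _, out, dst in out_edges[state]:
                if out is EPS:
                    labels |= from_state(dst)
                else:
                    labels.add(out)
            memo[state] = labels
        return memo[state]

    # Only epsilon-output edges need a precomputed set.
    return {e: frozenset(from_state(e[3])) for e in edges if e[2] is EPS}


reach = reachable_outputs(LEXICON)
for edge, words in reach.items():
    print(edge, sorted(words))
```

One plausible use of such a table, stated here as an assumption rather than as a detail given in the abstract, is that a composition step could skip an epsilon edge of the lexicon when none of its reachable words is accepted from the current language-model state, so that fewer dead-end states are created.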


Bibliographic reference. Caseiro, Diamantino / Trancoso, Isabel (2001): "On integrating the lexicon with the language model", in EUROSPEECH-2001, 2131-2134.