EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology
2nd INTERSPEECH Event

Aalborg, Denmark
September 3-7, 2001

A Weight Pushing Algorithm for Large Vocabulary Speech Recognition

Mehryar Mohri, Michael Riley

AT&T Labs - Research, USA

Weighted finite-state transducers provide a general framework for representing the components of speech recognition systems: language models, pronunciation dictionaries, context-dependent models, HMM-level acoustic models, and the output word or phone lattices can all be represented by weighted automata and transducers. In general, a representation is not unique, and different weighted transducers may realize the same mapping. In particular, even when two equivalent transducers have exactly the same topology with the same input and output labels, they may differ in how the weights are distributed along each path. We present a "weight pushing" algorithm that modifies the weights of a given weighted transducer so that the transition probabilities form a stochastic distribution. The result is an equivalent transducer whose weight distribution is better suited to pruning and speech recognition. Using this algorithm, we demonstrate substantial improvements in the speed of our recognition system on several tasks. We report a 45% speedup at 83% word accuracy with a simple single-pass 40,000-word vocabulary North American Business News (NAB) recognition system on the DARPA Eval '95 test set. With the same technique, we report a 550% speedup at 88% word accuracy in rescoring NAB word lattices with more accurate second-pass models. Finally, we report a 280% speedup at 68% word accuracy for the recognition of 100,000 first name-last name pairs.
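To make the idea of redistributing weights concrete, the following is a minimal sketch of weight pushing, not the authors' implementation. It assumes an acyclic weighted automaton over the probability semiring (ordinary + and ×), whereas a recognizer would typically work with negative log weights and general topologies. The function name `push_weights`, the arc tuple layout, and the toy automaton are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def push_weights(arcs, finals, start):
    """Push weights toward the initial state (illustrative sketch only).

    arcs:   list of (src, label, weight, dst) tuples
    finals: dict mapping final states to their final weights
    start:  the initial state
    Assumes an acyclic automaton and the probability semiring (+, *).
    """
    out = defaultdict(list)
    states = {start} | set(finals)
    for src, label, w, dst in arcs:
        out[src].append((w, dst))
        states.update((src, dst))

    # d[q] = total weight of all paths from q to a final state.
    d = {}

    def total(q):
        if q not in d:
            d[q] = finals.get(q, 0.0) + sum(w * total(t) for w, t in out[q])
        return d[q]

    for q in states:
        total(q)

    # Reweight each arc by d[dst] / d[src]; arcs touching dead states are dropped.
    pushed_arcs = [
        (s, label, w * d[t] / d[s], t)
        for s, label, w, t in arcs
        if d[s] > 0 and d[t] > 0
    ]
    pushed_finals = {q: w / d[q] for q, w in finals.items() if d[q] > 0}
    # The residual total weight is kept as an initial weight.
    return pushed_arcs, pushed_finals, d[start]


# Hypothetical toy automaton with unnormalized weights.
arcs = [
    (0, "a", 2.0, 1),
    (0, "b", 1.0, 2),
    (1, "c", 3.0, 3),
    (2, "d", 4.0, 3),
    (2, "e", 4.0, 3),
]
finals = {3: 1.0}
pushed_arcs, pushed_finals, initial_weight = push_weights(arcs, finals, 0)
```

In this sketch, the weights leaving each state (together with that state's final weight) sum to one after pushing, and every complete path keeps its original weight once the returned initial weight is multiplied in. That is the sense in which the pushed automaton is equivalent to the original while being stochastic.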

Bibliographic reference.  Mohri, Mehryar / Riley, Michael (2001): "A weight pushing algorithm for large vocabulary speech recognition", In EUROSPEECH-2001, 1603-1606.