12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Painless WFST Cascade Construction for LVCSR - Transducersaurus

Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose

University of Tokyo, Japan

This paper introduces the Transducersaurus toolkit, which provides a set of classes for generating each of the fundamental components of a typical WFST ASR cascade: a context-dependency transducer, a lexicon, a stochastic language model, and an optional silence class model. The toolkit also implements a simple scripting language that facilitates the construction of cascades using a variety of popular combination and optimization methods. It provides integrated support for the T3 and Juicer WFST decoders and for both Sphinx and HTK format acoustic models. New results for two standard WSJ tasks are also provided, comparing a variety of cascade construction and optimization algorithms. These results illustrate the flexibility of the toolkit as well as the tradeoffs inherent in the various build algorithms.
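
For readers unfamiliar with how cascade components are combined, the following is a minimal sketch of weighted transducer composition, the operation typically used to join a lexicon with a language model. It is not the Transducersaurus API, which the abstract does not describe. The dictionary layout, the function names `compose` and `best_path`, the toy lexicon `L`, and the toy grammar `G` are all assumptions made for illustration. The sketch assumes epsilon-free machines and tropical-semiring weights (weights add along a path, lower is better), and it treats each pronunciation as a single input symbol, which a real lexicon would not do.

```python
from collections import deque

def compose(a, b):
    """Epsilon-free composition: match a's output labels to b's input labels."""
    start = (a["start"], b["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        arcs[state] = []
        # A paired state is final only if both component states are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for ia, oa, wa, na in a["arcs"].get(qa, []):
            for ib, ob, wb, nb in b["arcs"].get(qb, []):
                if oa == ib:
                    nxt = (na, nb)
                    arcs[state].append((ia, ob, wa + wb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}

def best_path(fst, inputs):
    """Return (weight, outputs) of the cheapest accepting path, or None."""
    hyps = {fst["start"]: (0.0, [])}  # best hypothesis per state
    for sym in inputs:
        nxt = {}
        for state, (w, out) in hyps.items():
            for i, o, aw, n in fst["arcs"].get(state, []):
                if i == sym:
                    cand = (w + aw, out + [o])
                    if n not in nxt or cand[0] < nxt[n][0]:
                        nxt[n] = cand
        hyps = nxt
    done = [(w + fst["finals"][s], out)
            for s, (w, out) in hyps.items() if s in fst["finals"]]
    return min(done, key=lambda x: x[0]) if done else None

# Hypothetical toy lexicon: pronunciation symbol -> word, with a pronunciation cost.
L = {"start": 0, "finals": {0: 0.0}, "arcs": {0: [
    ("t_ah_m_ey_t_ow", "tomato", 0.5, 0),
    ("t_ah_m_aa_t_ow", "tomato", 1.0, 0),
    ("s_uw_p", "soup", 0.0, 0),
]}}

# Hypothetical toy grammar: accepts "tomato soup" or "soup".
G = {"start": 0, "finals": {2: 0.0}, "arcs": {
    0: [("tomato", "tomato", 0.25, 1), ("soup", "soup", 1.0, 2)],
    1: [("soup", "soup", 0.5, 2)],
}}

LG = compose(L, G)
print(best_path(LG, ["t_ah_m_aa_t_ow", "s_uw_p"]))  # (1.75, ['tomato', 'soup'])
print(best_path(LG, ["s_uw_p", "t_ah_m_ey_t_ow"]))  # None
```

A full cascade would additionally compose a context-dependency transducer on the input side and would need epsilon handling plus optimization steps such as determinization and minimization, none of which this sketch attempts.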

Bibliographic reference. Novak, Josef R. / Minematsu, Nobuaki / Hirose, Keikichi (2011): "Painless WFST cascade construction for LVCSR - Transducersaurus", in INTERSPEECH-2011, 1537-1540.