13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Dynamic Grammars with Lookahead Composition for WFST-based Speech Recognition

Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose

Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan

Automatic Speech Recognition (ASR) applications often employ a mixture of static and dynamic grammar components, and can thus benefit from the ability to efficiently modify the system vocabulary and other parameters on-line. This paper presents a novel, generic approach to dynamic grammar handling in the context of the Weighted Finite-State Transducer (WFST) paradigm. The method relies on a straightforward extension of the lexicon and underlying grammar components, and leverages on-the-fly composition and delayed construction to generate the recognition search space efficiently at run time. The alternative partitioning of component models that this approach implies can also result in significant storage savings. In contrast to previous work in this area, the proposed method relies only on generic WFST operations and the context-dependency, lexicon and grammar components that form the basis of standard ASR cascades.
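The idea of delayed (on-the-fly) composition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes epsilon-free transducers and tropical-semiring weights (costs are added along a path and minimized across paths), and it omits lookahead filters entirely. The names `Fst`, `LazyCompose` and `best_cost`, and the toy lexicon and grammar, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fst:
    """Toy epsilon-free WFST (hypothetical container, not a library type)."""
    start: int
    finals: dict  # state -> final cost
    arcs: dict    # state -> list of (ilabel, olabel, cost, next_state)


class LazyCompose:
    """Composition A o B whose states are expanded only when first visited."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.start = (a.start, b.start)
        self._cache = {}  # composed state -> list of outgoing arcs

    def final_weight(self, state):
        qa, qb = state
        if qa in self.a.finals and qb in self.b.finals:
            return self.a.finals[qa] + self.b.finals[qb]
        return None

    def arcs(self, state):
        # Build the outgoing arcs of a composed state on demand and cache them.
        if state not in self._cache:
            qa, qb = state
            out = []
            for i, o, w1, na in self.a.arcs.get(qa, []):
                for i2, o2, w2, nb in self.b.arcs.get(qb, []):
                    if o == i2:  # output of A must match input of B
                        out.append((i, o2, w1 + w2, (na, nb)))
            self._cache[state] = out
        return self._cache[state]

    def expanded_states(self):
        return len(self._cache)


def best_cost(comp, inputs):
    """Lowest total cost of accepting `inputs`, or None if rejected."""
    frontier = {comp.start: 0.0}
    for sym in inputs:
        nxt = {}
        for state, cost in frontier.items():
            for i, _o, w, dest in comp.arcs(state):
                if i == sym and (dest not in nxt or cost + w < nxt[dest]):
                    nxt[dest] = cost + w
        frontier = nxt
    totals = [c + comp.final_weight(s) for s, c in frontier.items()
              if comp.final_weight(s) is not None]
    return min(totals) if totals else None


# Toy one-state lexicon: pronunciation token -> word.
lexicon = Fst(start=0, finals={0: 0.0}, arcs={0: [
    ("k-ao-l", "call", 0.0, 0),
    ("jh-aa-n", "john", 0.0, 0),
    ("m-eh-r-iy", "mary", 0.0, 0),
]})

# Toy grammar: "call" followed by a name.
grammar = Fst(start=0, finals={2: 0.0}, arcs={
    0: [("call", "call", 0.0, 1)],
    1: [("john", "john", 0.5, 2), ("mary", "mary", 1.5, 2)],
})

comp = LazyCompose(lexicon, grammar)
cost_john = best_cost(comp, ["k-ao-l", "jh-aa-n"])
states_touched = comp.expanded_states()

# Dynamic update: add a new name to both components, then recompose lazily.
lexicon.arcs[0].append(("b-aa-b", "bob", 0.0, 0))
grammar.arcs[1].append(("bob", "bob", 1.0, 2))
cost_bob = best_cost(LazyCompose(lexicon, grammar), ["k-ao-l", "b-aa-b"])
```

Because composed states are created only when the search reaches them, adding a word requires editing the two small component machines and discarding the cache, rather than rebuilding a fully expanded static network. A practical system would additionally need epsilon handling and a lookahead filter to prune non-matching paths early, neither of which is attempted here.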

Index Terms: WFST, ASR, Dynamic vocabulary, Spoken dialog systems

Bibliographic reference. Novak, Josef R. / Minematsu, Nobuaki / Hirose, Keikichi (2012): "Dynamic grammars with lookahead composition for WFST-based speech recognition", In INTERSPEECH-2012, 1079-1082.