Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

An Approach to Unlimited Vocabulary Continuous Speech Recognition Based on Context-Dependent Phoneme Modeling

Y. Abe, K. Nakajima

Computer and Information Systems Laboratory, Mitsubishi Electric Corporation, Kamakura, Japan

This paper describes a continuous speech recognition system that searches for an arbitrary phoneme string using a contextual model. Two contextual models, the Mixture Multiple Linear Phonetic-Context Model (MM-LPCM) and the Mixture Single LPCM (MS-LPCM), are presented. Both are designed to represent more complicated variations that the original LPCM cannot represent. In isolated phoneme recognition experiments, the MM-LPCM achieved the lowest error rate, 4.6%, which is 1.4 points lower than that of the original LPCM. The context-dependent search is based on a hypothesis-and-test scheme in which a phoneme string hypothesis is expanded by appending one phoneme at a time, taking the left and right contexts into account. A mechanism that bounds the phoneme boundaries is introduced to reduce insertion errors in the search. The context-dependent search algorithm with the improved contextual model achieved a total phoneme error rate of 12.9%, half that of context-independent search.
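
The hypothesis-and-test expansion described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the beam pruning, the fixed hypothesis length, the `#` boundary symbol, and the toy score table are all assumptions made for illustration, and the acoustic evidence, the LPCM models, and the boundary-bounding mechanism are omitted. It only shows how a phoneme can be scored once both its left and right neighbours are known, that is, one step after it is appended.

```python
from typing import Callable, List, Tuple

BOUNDARY = "#"  # hypothetical symbol marking utterance start and end

Hypothesis = Tuple[Tuple[str, ...], float]
Scorer = Callable[[str, str, str], float]  # (left, phoneme, right) -> log score


def context_search(phonemes: List[str], score: Scorer,
                   length: int, beam_width: int = 3) -> Hypothesis:
    """Grow phoneme strings one phoneme at a time, keeping the best few.

    A phoneme is scored only when its right context becomes known,
    i.e. when the next phoneme (or the end boundary) is appended.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    beam: List[Hypothesis] = [((), 0.0)]
    for _ in range(length):
        expanded: List[Hypothesis] = []
        for hyp, total in beam:
            for p in phonemes:
                new_total = total
                if hyp:
                    # The previous phoneme now has a right context: p.
                    left = hyp[-2] if len(hyp) > 1 else BOUNDARY
                    new_total += score(left, hyp[-1], p)
                expanded.append((hyp + (p,), new_total))
        # Test step: keep only the highest-scoring hypotheses.
        expanded.sort(key=lambda h: h[1], reverse=True)
        beam = expanded[:beam_width]
    # Score the final phoneme against the end boundary.
    finished: List[Hypothesis] = []
    for hyp, total in beam:
        left = hyp[-2] if len(hyp) > 1 else BOUNDARY
        finished.append((hyp, total + score(left, hyp[-1], BOUNDARY)))
    return max(finished, key=lambda h: h[1])


# Toy context-dependent scores (invented values, not from the paper).
TOY_TABLE = {("#", "a", "k"): 0.0, ("a", "k", "i"): 0.0, ("k", "i", "#"): 0.0}


def toy_score(left: str, phoneme: str, right: str) -> float:
    return TOY_TABLE.get((left, phoneme, right), -5.0)


best = context_search(["a", "k", "i"], toy_score, length=3)
print(best)
```

With this toy table the search settles on the string a-k-i, because only that sequence avoids the penalty in every left/right context. A real recognizer would score against speech frames and would not know the string length in advance.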

Bibliographic reference. Abe, Y. / Nakajima, K. (1992): "An approach to unlimited vocabulary continuous speech recognition based on context-dependent phoneme modeling", in ICSLP-1992, 1547-1550.