Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes a continuous speech recognition system that searches over arbitrary phoneme strings using a contextual model. Two contextual models are presented: the Mixture Multiple Linear Phonetic-Context Model (MM-LPCM) and the Mixture Single LPCM (MS-LPCM). Both are designed to represent more complicated variations that the original LPCM cannot represent. In isolated phoneme recognition experiments, the MM-LPCM achieved the lowest error rate, 4.6%, which is 1.4 points lower than that of the original LPCM. The context-dependent search is based on a hypothesis-and-test scheme in which a phoneme string hypothesis is expanded by appending one phoneme at a time, taking the left and right contexts into account. A mechanism that bounds the phonemic boundaries is introduced to reduce insertion errors in the search. The context-dependent search algorithm with the improved contextual model achieved a total phoneme error rate of 12.9%, half that of context-independent search.
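The hypothesis-and-test expansion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the beam pruning, the `BOUNDARY` symbol, the fixed string length, and the `score_fn(left, center, right)` interface are all assumptions made for illustration, and the boundary-bounding mechanism and the LPCM models themselves are not modeled. Each hypothesis grows by one phoneme at a time, and a phoneme is scored only once its right neighbor is known.

```python
import heapq
from typing import Callable, List, Sequence, Tuple

BOUNDARY = "#"  # hypothetical symbol for the utterance start and end

# Hypothetical scorer: (left context, phoneme, right context) -> log-score.
ScoreFn = Callable[[str, str, str], float]
Hypothesis = Tuple[float, Tuple[str, ...]]  # (accumulated score, phoneme string)


def left_context(string: Tuple[str, ...]) -> str:
    """Return the phoneme preceding the last one, or BOUNDARY if there is none."""
    return string[-2] if len(string) > 1 else BOUNDARY


def expand(hyp: Hypothesis, phonemes: Sequence[str], score_fn: ScoreFn) -> List[Hypothesis]:
    """Append each candidate phoneme to the hypothesis.

    Appending a phoneme fixes the right context of the previous phoneme,
    so the previous phoneme is scored at this point.
    """
    score, string = hyp
    expanded = []
    for p in phonemes:
        new_score = score
        if string:
            new_score += score_fn(left_context(string), string[-1], p)
        expanded.append((new_score, string + (p,)))
    return expanded


def search(phonemes: Sequence[str], score_fn: ScoreFn, length: int, beam: int = 3) -> Hypothesis:
    """Beam search over phoneme strings of a fixed length (assumed known here)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    hyps: List[Hypothesis] = [(0.0, ())]
    for _ in range(length):
        candidates = [h for hyp in hyps for h in expand(hyp, phonemes, score_fn)]
        hyps = heapq.nlargest(beam, candidates)  # keep the best-scoring hypotheses
    # Score the final phoneme of each survivor with the utterance end as right context.
    finished = [
        (score + score_fn(left_context(string), string[-1], BOUNDARY), string)
        for score, string in hyps
    ]
    return max(finished)
```

In a real recognizer the scorer would compare acoustic observations against a context-dependent phoneme model, and the string length would not be fixed in advance; here any function with the assumed three-argument signature can be passed as `score_fn`.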
Bibliographic reference. Abe, Y. / Nakajima, K. (1992): "An approach to unlimited vocabulary continuous speech recognition based on context-dependent phoneme modeling", In ICSLP-1992, 1547-1550.