First International Conference on Spoken Language Processing (ICSLP 90)
This paper describes a continuous speech recognition system that uses phonemes as the basic units in both acoustic and linguistic processing. Acoustically, a phoneme is difficult to identify because its realization varies strongly with phonemic context. To cope with this contextual variability, we use a linear model called the Linear Phonetic-Context Model (LPCM), which represents acoustic features as the sum of context-independent and context-dependent components. Incorporating the LPCM, we design a phoneme-based phrase recognition algorithm that accepts speech input consisting of an arbitrary string of phonemes. The algorithm obtains multiple recognition candidates using an end-point spotting method together with a beam-search technique. The language model is based on task-independent statistics of phoneme strings and assigns a probability to a phrase that does not occur in the corpus, avoiding the zero-probability problem of the simple N-gram model. In experiments on recognizing 336 phrases extracted from 50 sentences spoken by a male speaker, we obtained a phoneme recognition rate of 95.0% and a phrase recognition rate of 67.9% without restricting the vocabulary.
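As a rough illustration of the additive idea behind the LPCM, the sketch below builds a feature vector for a phoneme as a context-independent base vector plus offsets contributed by its left and right neighbours. This is only one possible reading of "sum of context-independent and context-dependent components": the function name `predict_features`, the table layout, and all numbers are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of an additive (linear) phonetic-context model.
# Names, table layout, and values are assumptions for illustration only.

def predict_features(phoneme, left, right, base, left_offset, right_offset):
    """Return base[phoneme] plus the offsets for the left and right context.

    Contexts missing from the offset tables contribute a zero vector,
    so the result falls back to the context-independent component.
    """
    zero = [0.0] * len(base[phoneme])
    lo = left_offset.get((left, phoneme), zero)
    ro = right_offset.get((phoneme, right), zero)
    return [b + l + r for b, l, r in zip(base[phoneme], lo, ro)]


# Toy 2-dimensional feature tables (made-up values).
base = {"a": [1.0, 2.0]}                    # context-independent component
left_offset = {("k", "a"): [0.5, -0.5]}     # effect of a preceding /k/ on /a/
right_offset = {("a", "t"): [-0.25, 0.25]}  # effect of a following /t/ on /a/

features = predict_features("a", "k", "t", base, left_offset, right_offset)
print(features)  # [1.25, 1.75]
```

With an unseen context pair, the same call returns the base vector unchanged, which is the practical appeal of an additive decomposition when the vocabulary is not fixed in advance.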
Bibliographic reference. Abe, Yoshiharu / Nakajima, Kunio (1990): "Vocabulary independent phrase recognition with a linear phonetic context model", In ICSLP-1990, 1189-1192.