Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
In HMM-based speech recognition, several methods have been proposed for controlling state or word duration, and their effectiveness is well known. However, these methods model the duration of each state or word in isolation and do not consider the relation among the durations of separate words within a sentence or of separate states within a word. This paper proposes a new method of syllable duration control for continuous Japanese speech recognition. It constrains syllable duration using the relations among the syllables, and it remains effective even when the speaking rate changes. First, the duration of a syllable is predicted from the matching periods that have already been spotted and from speaker-independent factors that affect syllable duration. Next, the matching period of the predicted syllable is constrained using the predicted duration. We evaluate prediction and recognition performance using 50 sentences and 10 speakers. The method improves the sentence recognition rate by 5.2%.
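The two steps described above (predict a duration, then constrain the matching period) can be illustrated with a minimal sketch. The abstract does not give the prediction formula, so everything here is an assumption for illustration: the relative-rate model, the function names `predict_duration`, `duration_window`, and `is_allowed`, the `tolerance` parameter, and the use of frame counts as units are all hypothetical and are not the authors' method.

```python
from statistics import mean


def predict_duration(observed, baseline, next_baseline):
    """Predict the duration of the next syllable (in frames).

    observed      -- durations of syllables already spotted in this utterance
    baseline      -- assumed speaker-independent expected durations of those
                     same syllables (hypothetical lookup values)
    next_baseline -- assumed speaker-independent expected duration of the
                     syllable to be predicted
    """
    if not observed or len(observed) != len(baseline):
        raise ValueError("observed and baseline must be non-empty and aligned")
    # Assumed model: the local speaking rate is the mean ratio between
    # observed and expected durations of the syllables matched so far.
    rate = mean(o / b for o, b in zip(observed, baseline))
    return rate * next_baseline


def duration_window(predicted, tolerance=0.3):
    """Return (shortest, longest) allowed duration around the prediction."""
    return predicted * (1.0 - tolerance), predicted * (1.0 + tolerance)


def is_allowed(start, end, window):
    """Check whether a candidate matching period [start, end) fits the window."""
    shortest, longest = window
    return shortest <= (end - start) <= longest


if __name__ == "__main__":
    # Two syllables already matched, both spoken 20% slower than expected.
    predicted = predict_duration([12, 18], [10, 15], next_baseline=20)
    window = duration_window(predicted)
    print(predicted, window, is_allowed(100, 120, window))
```

Because the prediction is scaled by the rate observed earlier in the same utterance, the allowed window stretches or shrinks with the speaker's tempo, which is the property the abstract attributes to the method.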
Bibliographic reference. Takizawa, Yumi / Tsuboka, Eiichi (1992): "Syllable duration prediction for speech recognition", In ICSLP-1992, 1371-1374.