Third International Conference on Spoken Language Processing (ICSLP 94)
In this paper we describe a method that trains HMMs from unlabeled training data by using concatenated training. Two problems must be solved in order to perform concatenated training from training speech data and its corresponding text. One is how to map the text to phonetic descriptions, and the other is how to obtain the single phonetic description that corresponds to the actual utterance. We use speech recognition as a preprocessing step before concatenated training. We create a finite state automaton to find the correct phonetic labels. This automaton connects all the words of the training sentence sequentially, from the first to the last, allows a pause between any two words, and includes all possible labels as parallel alternatives whenever a word cannot be transcribed to a single phonetic label. In a continuous speech recognition experiment, word accuracy improved from 91.3% to 95.2%.
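The structure of such an automaton can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pause symbol `pau`, the toy lexicon, and the function names `build_automaton` and `enumerate_paths` are all assumptions made for illustration. The sketch only builds the word chain with optional pauses and parallel pronunciation arcs, then lists the phonetic label sequences the automaton accepts. In the method described above, a speech recognizer would choose among these paths using the acoustic data, a step that is not modeled here.

```python
PAUSE = "pau"  # hypothetical pause symbol, not taken from the paper


def build_automaton(words, lexicon):
    """Build a left-to-right automaton for one training sentence.

    Returns (arcs, final_state). Each arc is (src, dst, label), where
    label is a tuple of phones; the empty tuple is an epsilon arc.
    """
    arcs = []
    state, next_state = 0, 1
    for i, word in enumerate(words):
        if i > 0:
            # Optional pause between two words: skip it or emit it.
            arcs.append((state, next_state, ()))
            arcs.append((state, next_state, (PAUSE,)))
            state, next_state = next_state, next_state + 1
        # One parallel arc per candidate pronunciation of the word.
        for pron in lexicon[word]:
            arcs.append((state, next_state, tuple(pron)))
        state, next_state = next_state, next_state + 1
    return arcs, state


def enumerate_paths(arcs, final_state):
    """List every phone sequence accepted from state 0 to final_state."""
    by_src = {}
    for src, dst, label in arcs:
        by_src.setdefault(src, []).append((dst, label))
    paths = []

    def walk(state, prefix):
        if state == final_state:
            paths.append(prefix)
            return
        for dst, label in by_src.get(state, []):
            walk(dst, prefix + label)

    walk(0, ())
    return paths


# Toy lexicon (assumed): "read" has two candidate pronunciations.
lexicon = {
    "read": [["r", "iy", "d"], ["r", "eh", "d"]],
    "books": [["b", "uh", "k", "s"]],
}
arcs, final_state = build_automaton(["read", "books"], lexicon)
for path in enumerate_paths(arcs, final_state):
    print(" ".join(path))
```

Enumerating paths is only practical for short sentences, since the number of paths grows multiplicatively with ambiguous words and pause positions. A recognizer would instead search the automaton directly, for example with Viterbi decoding.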
Bibliographic reference. Yi, Me (1994): "Concatenated training of subword HMMs using detected labels", In ICSLP-1994, 303-306.