EUROSPEECH '91

Output distributions in hidden Markov models describe essential acoustic characteristics. Triphone generalization may force two models to be merged when only some of their output distributions are similar while the rest differ. This problem can be avoided if clustering is carried out at the distribution level. In this paper, a shared-distribution model is proposed to replace generalized triphone models for speaker-independent continuous speech recognition. Here, output distributions in the hidden Markov models are shared with each other if they exhibit acoustic similarity. In addition to providing a more detailed representation, this approach gives us the freedom to use a large number of states for each phonetic model. Although increasing the number of states increases the total number of free parameters, distribution sharing lets us essentially eliminate the redundant states while retaining the necessary ones. By using the shared-distribution model, the error rate on the DARPA Resource Management task has been reduced by 20% in comparison with the baseline SPHINX system.
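The idea of clustering at the distribution level rather than the model level can be illustrated with a minimal sketch. It is not the procedure from the paper: the abstract does not state the similarity measure or the stopping rule, so the count-weighted entropy increase used as the merge cost, the fixed target number of clusters, and all names (`cluster_distributions`, `merge_cost`, the state labels) are assumptions made for illustration. Each state's discrete output distribution is represented by a vector of codeword counts, and the two closest distributions are merged greedily.

```python
import math
from itertools import combinations


def entropy(counts):
    """Entropy (in nats) of the distribution implied by a count vector."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def merge_cost(a, b):
    """Count-weighted entropy increase caused by pooling two count vectors.

    Assumed similarity measure: a small value means the two output
    distributions are acoustically similar and cheap to share.
    """
    merged = [x + y for x, y in zip(a, b)]
    return (sum(merged) * entropy(merged)
            - sum(a) * entropy(a)
            - sum(b) * entropy(b))


def cluster_distributions(state_counts, num_clusters):
    """Greedily merge state output distributions until num_clusters remain.

    state_counts maps a (hypothetical) state label to its codeword counts.
    Returns a dict mapping each state label to a shared-distribution index.
    """
    clusters = [([name], list(counts)) for name, counts in state_counts.items()]
    while len(clusters) > num_clusters:
        # Pick the pair of clusters whose merge loses the least information.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_cost(clusters[p[0]][1], clusters[p[1]][1]),
        )
        names = clusters[i][0] + clusters[j][0]
        counts = [x + y for x, y in zip(clusters[i][1], clusters[j][1])]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((names, counts))
    return {name: idx for idx, (names, _) in enumerate(clusters) for name in names}


# Two hypothetical triphone models, A and B, with made-up counts:
# their first states are similar, their last states are not.
example_counts = {
    "A.s0": [90, 10, 0],
    "B.s0": [85, 15, 0],
    "A.s2": [0, 10, 90],
    "B.s2": [0, 90, 10],
}
sharing = cluster_distributions(example_counts, num_clusters=3)
```

In this toy example the first states of A and B end up sharing one distribution while their last states stay separate. Merging the two models as whole units, as triphone generalization would, could not express that partial similarity.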
Bibliographic reference. Hwang, M. Y. / Huang, X. D. (1991): "Acoustic distribution clustering in phonetic hidden Markov models", In EUROSPEECH-1991, 785-788.