4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
The development of human-machine dialog applications for messaging and information retrieval over the telephone places stringent requirements on the accuracy and speed of the automatic speech recognition (ASR) system. In this paper, we describe strategies for improved acoustic-phone modeling aimed at increasing recognition accuracy while keeping the number of phone units low. Specifically, this paper considers: (1) the development of an improved set of head-tail context-dependent (CD) triphones, and (2) a novel criterion for selecting the number of states assigned to each phone unit, based on the coefficient of variation of the feature components in the HMM Gaussians. Performance of the models is evaluated using data that represent real telephony applications.
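As a rough illustration of the statistic named in item (2), the coefficient of variation of a feature component is the ratio of its standard deviation to the magnitude of its mean. The sketch below computes it per dimension for a single diagonal-covariance Gaussian. The function name and the diagonal-covariance assumption are illustrative only; the abstract does not state how the measure is mapped to a number of states, so that step is not shown.

```python
import math


def coefficient_of_variation(means, variances):
    """Return the per-dimension coefficient of variation (std / |mean|).

    Hypothetical helper: `means` and `variances` are the parameters of one
    diagonal-covariance Gaussian, given as equal-length sequences of floats.
    A zero mean yields math.inf, since the ratio is undefined there.
    """
    cvs = []
    for mu, var in zip(means, variances):
        if mu == 0:
            cvs.append(math.inf)
        else:
            cvs.append(math.sqrt(var) / abs(mu))
    return cvs


# Example: two feature components with the same relative spread.
example_cv = coefficient_of_variation([2.0, -4.0], [1.0, 4.0])
```

A larger value indicates a component whose spread is wide relative to its mean; how such values would be aggregated across Gaussians or phones is left open here.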
Bibliographic reference. Zeljkovic, Ilija / Narayanan, Shrikanth (1996): "Improved HMM phone and triphone models for realtime ASR telephony applications", In ICSLP-1996, 1105-1108.