4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Improved Extended HMM Composition by Incorporating Power Variance

Yasuhiro Minami, Sadaoki Furui

NTT Human Interface Laboratories, Musashino-shi, Tokyo, Japan

This paper describes an improvement to extended HMM composition, a method that can precisely adapt HMMs to both noisy and distorted speech. To do this, we incorporate the variance of power into extended HMM composition, using quantization to approximate the Gaussian distribution of the 0th-order cepstrum. Consequently, the distribution of noisy speech is approximated in the linear spectral domain as a mixture of lognormal distributions. The method is evaluated in a four-digit recognition experiment in which the number of digits is known. Two types of noise, computer room noise and car noise, are used; noisy and distorted speech data are created by adding each type of noise to speech recorded with a boundary microphone. The results show that the proposed method improves recognition rates for noisy and distorted speech compared with our previous method.
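
The quantization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the quantization scheme, so the equal-width bins, the `span` of three standard deviations, the number of levels, and the function names `normal_cdf` and `quantize_gaussian` are all assumptions made for illustration. The sketch only shows how a single Gaussian (standing in for the distribution of the 0th-order cepstrum) can be replaced by a small set of weighted point values, which is what would turn one distribution into a mixture with one component per level.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def quantize_gaussian(mean, std, levels=5, span=3.0):
    """Approximate N(mean, std^2) by `levels` weighted point values.

    Assumed scheme (hypothetical, not taken from the paper): the range
    mean +/- span*std is cut into equal-width bins, each bin is represented
    by its centre, and its weight is the Gaussian probability mass in the
    bin. Tail mass outside the range is folded into the two outer bins,
    so the weights sum to one.
    """
    lo = mean - span * std
    width = 2.0 * span * std / levels
    points = []
    for k in range(levels):
        left = lo + k * width
        right = left + width
        centre = 0.5 * (left + right)
        p_left = 0.0 if k == 0 else normal_cdf(left, mean, std)
        p_right = 1.0 if k == levels - 1 else normal_cdf(right, mean, std)
        points.append((centre, p_right - p_left))
    return points


if __name__ == "__main__":
    # Illustrative values only: a unit Gaussian quantized to five levels.
    for value, weight in quantize_gaussian(0.0, 1.0, levels=5):
        print(f"value={value:+.2f}  weight={weight:.4f}")
```

Under these assumptions, each `(value, weight)` pair would correspond to one mixture component whose power term is fixed at `value`, with `weight` as its mixture weight. More levels give a finer approximation at the cost of more components.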


Bibliographic reference.  Minami, Yasuhiro / Furui, Sadaoki (1996): "Improved extended HMM composition by incorporating power variance", In ICSLP-1996, 1109-1112.