4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Smoothed Spectral Subtraction for a Frequency-Weighted HMM in Noisy Speech Recognition

Hiroshi Matsumoto, Noboru Naitoh

Dept. of Electrical & Electronic Eng., Faculty of Engineering, Shinshu University, Nagano-shi, Nagano, Japan

This paper proposes improved methods of smoothed spectral subtraction to enhance the recognition performance of a frequency-weighted HMM (HMM-FW) in very noisy environments. Conventional spectral subtraction tends to produce discontinuities in the estimated power spectra. This distortion is undesirable for HMM-FW, which uses group delay spectra as feature vectors. To remove this distortion, this paper proposes two frequency smoothing methods in the log-spectral domain: (1) low-pass liftering by DCT, and (2) a weighted minimum mean square error method (WMSE), which fits a cosine series to an estimated log-power spectrum. The results show that the smoothers are very effective under very noisy conditions, especially for the frequency-weighted HMM. The WMSE method combined with HMM-FW achieves the highest recognition accuracies, for instance, improving the recognition rate from 68% to 88% at -6 dB SNR of car noise.
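The two smoothers named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the subtraction rule, the number of cosine terms, or the weighting function, so the function names, `alpha`, `floor`, `num_terms`, and the choice of `weights` below are all assumptions made for illustration. The sketch treats low-pass liftering as keeping only the low-order DCT-II coefficients of the log-power spectrum, and WMSE as a weighted least-squares fit of the same cosine series.

```python
import numpy as np

def spectral_subtract(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Power spectral subtraction with a spectral floor (assumed generic form)."""
    cleaned = noisy_power - alpha * noise_power
    # Flooring avoids negative power but is one source of spectral discontinuity.
    return np.maximum(cleaned, floor * noisy_power)

def cosine_basis(num_bins, num_terms):
    """Columns are cos(pi * k * (n + 0.5) / N) for k = 0..num_terms-1 (DCT-II basis)."""
    n = np.arange(num_bins) + 0.5
    k = np.arange(num_terms)
    return np.cos(np.pi * np.outer(n, k) / num_bins)

def lifter_smooth(log_power, num_terms):
    """Low-pass liftering: keep the first num_terms DCT-II coefficients, then invert."""
    num_bins = len(log_power)
    basis = cosine_basis(num_bins, num_terms)
    scale = np.full(num_terms, 2.0 / num_bins)
    scale[0] = 1.0 / num_bins
    coeffs = scale * (basis.T @ log_power)
    return basis @ coeffs

def wmse_smooth(log_power, weights, num_terms):
    """Weighted least-squares fit of a cosine series to the log-power spectrum.

    weights: non-negative per-bin reliabilities (hypothetical; the paper's
    actual weighting is not stated in the abstract).
    """
    basis = cosine_basis(len(log_power), num_terms)
    root_w = np.sqrt(weights)
    coeffs, *_ = np.linalg.lstsq(basis * root_w[:, None],
                                 log_power * root_w, rcond=None)
    return basis @ coeffs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_bins = 64
    noise = np.full(num_bins, 1.0)
    speech = 4.0 * np.exp(-np.linspace(0.0, 3.0, num_bins))
    noisy = speech + noise * rng.uniform(0.5, 1.5, num_bins)
    estimate = spectral_subtract(noisy, noise)
    log_est = np.log(estimate)
    # Assumed weighting: trust bins less when subtraction hit the floor.
    weights = np.where(estimate <= 0.01 * noisy, 1e-3, 1.0)
    print(lifter_smooth(log_est, 12)[:4])
    print(wmse_smooth(log_est, weights, 12)[:4])
```

With uniform weights the two smoothers coincide, because the DCT-II basis is orthogonal over the frequency bins. The weighted fit differs only when some bins are marked as less reliable, which is what lets it discount bins distorted by the subtraction.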


Bibliographic reference.  Matsumoto, Hiroshi / Naitoh, Noboru (1996): "Smoothed spectral subtraction for a frequency-weighted HMM in noisy speech recognition", In ICSLP-1996, 905-908.