4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
This paper proposes improved methods of smoothed spectral subtraction to enhance the recognition performance of a frequency-weighted HMM (HMM-FW) in very noisy environments. Conventional spectral subtraction tends to produce discontinuities in the estimated power spectra. This distortion is undesirable for HMM-FW, which uses group delay spectra as feature vectors. To remove this distortion, this paper proposes two frequency smoothing methods in the log-spectral domain: (1) low-pass liftering by DCT, and (2) a weighted minimum mean square error (WMSE) method, which fits a cosine series to an estimated log-power spectrum. The results show that the smoothers are very effective under very noisy conditions, especially for the frequency-weighted HMM. The WMSE method combined with HMM-FW achieves the highest recognition accuracies, for instance improving the recognition rate from 68% to 88% at -6 dB SNR in car noise.
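The two smoothers described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the over-subtraction factor `alpha`, the spectral floor, the cosine-series order, and the per-bin weights are all assumptions introduced here for illustration, since the abstract does not specify them. The sketch applies basic power spectral subtraction, then smooths a log-power spectrum either by keeping only the low-order DCT coefficients (low-pass liftering) or by a weighted least-squares fit of a cosine series (the WMSE idea).

```python
import numpy as np

def spectral_subtract(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Basic power spectral subtraction with a spectral floor.

    alpha and floor are illustrative defaults, not values from the paper.
    """
    cleaned = noisy_power - alpha * noise_power
    return np.maximum(cleaned, floor * noisy_power)

def cosine_basis(num_bins, order):
    """DCT-II style basis: column k holds cos(pi * k * (n + 0.5) / num_bins)."""
    n = np.arange(num_bins)[:, None]
    k = np.arange(order)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / num_bins)

def lowpass_lifter(log_power, order):
    """Smooth a log-power spectrum by keeping its first `order` DCT terms.

    The basis columns are orthogonal, so this unweighted least-squares fit
    is equivalent to truncating the DCT-II of the log spectrum.
    """
    basis = cosine_basis(len(log_power), order)
    coeffs, *_ = np.linalg.lstsq(basis, log_power, rcond=None)
    return basis @ coeffs

def wmse_fit(log_power, weights, order):
    """Fit a cosine series by minimizing sum_n w[n] * (x[n] - fit[n])**2.

    The choice of weights is an assumption; the abstract does not define them.
    """
    basis = cosine_basis(len(log_power), order)
    root_w = np.sqrt(weights)
    coeffs, *_ = np.linalg.lstsq(root_w[:, None] * basis,
                                 root_w * log_power, rcond=None)
    return basis @ coeffs
```

With uniform weights, `wmse_fit` reduces to `lowpass_lifter`. Non-uniform weights let the fit rely less on bins where the subtracted spectrum is unreliable, for example bins clipped to the spectral floor; that weighting rule is a hypothetical choice, not one stated in the abstract.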
Bibliographic reference. Matsumoto, Hiroshi / Naitoh, Noboru (1996): "Smoothed spectral subtraction for a frequency-weighted HMM in noisy speech recognition", In ICSLP-1996, 905-908.