First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Features for Noise-Robust Speaker-Independent Word Recognition

Brian A. Hanson, Ted H. Applebaum

Speech Technology Laboratory, Division of Panasonic Technologies, Inc., Santa Barbara, CA, USA

Effects such as additive noise and noise-induced changes in vocal effort (the Lombard effect) can cause significant loss of performance in recognizers trained on normal (non-noisy, non-Lombard) speech. In earlier work on the English digits vocabulary, recognition rates were improved over a "standard" speech representation, consisting of cepstral coefficients and their first time derivative calculated over a 50 ms interval, by lengthening the interval over which the first derivative is calculated and by incorporating a second-derivative feature. The current paper extends this work to recognition of a much more confusable vocabulary. The recognition results are analyzed for each proposed change in the speech representation, examined by confusable subsets of the vocabulary, and contrasted with the previous results. Most of the earlier findings for the digits vocabulary were confirmed for the confusable vocabulary. Additionally, adding a third-derivative feature was found to further enhance performance.
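As an illustration of the kind of representation described above, the following is a minimal sketch of appending time-derivative features to a sequence of cepstral frames. The abstract does not say how the derivatives are computed, so several things here are assumptions: the derivative is estimated as a regression slope over a symmetric window, higher-order derivatives are obtained by applying the same estimator repeatedly, and edges are handled by repeating the first and last frame. The names `delta`, `stack_features`, and `half_width` are hypothetical. With an assumed 10 ms frame step, `half_width=2` spans five frames, roughly a 50 ms interval; a larger `half_width` lengthens the interval.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], half_width: int) -> List[Frame]:
    """Regression-slope estimate of each coefficient's time derivative.

    Assumption: the window spans 2 * half_width + 1 frames, and frames
    beyond either end are replaced by the first or last frame.
    """
    if half_width < 1:
        raise ValueError("half_width must be at least 1")
    n = len(frames)
    denom = 2.0 * sum(k * k for k in range(1, half_width + 1))
    out: List[Frame] = []
    for t in range(n):
        row: Frame = []
        for c in range(len(frames[t])):
            num = 0.0
            for k in range(1, half_width + 1):
                ahead = frames[min(t + k, n - 1)][c]
                behind = frames[max(t - k, 0)][c]
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def stack_features(cepstra: List[Frame], half_width: int,
                   max_order: int) -> List[Frame]:
    """Append derivatives of order 1..max_order to each cepstral frame."""
    streams = [cepstra]
    for _ in range(max_order):
        # Each higher-order stream is the derivative of the previous one.
        streams.append(delta(streams[-1], half_width))
    return [sum((s[t] for s in streams), []) for t in range(len(cepstra))]
```

Calling `stack_features(cepstra, half_width=2, max_order=3)` would yield frames holding the cepstral coefficients followed by their first, second, and third derivative estimates, which mirrors the feature set the abstract reports on, though not necessarily the authors' exact computation.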


Bibliographic reference. Hanson, Brian A. / Applebaum, Ted H. (1990): "Features for noise-robust speaker-independent word recognition", in ICSLP-1990, 1117-1120.