EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Feature Vector Selection to Improve ASR Robustness in Noisy Conditions

Johan de Veth (1), Laurent Mauuary (2), Bernhard Noe (3), Febe de Wet (1), Jürgen Sienel (3), Louis Boves (1), Denis Jouvet (2)

(1) University of Nijmegen, The Netherlands; (2) France Télécom R&D, France; (3) Alcatel SEL, Germany

It is well known that noise reduction schemes are beneficial in ASR to reduce training-test mismatch due to noise. However, a significant mismatch may still remain after noise reduction, especially in the non-speech portions of the signals. To reduce the impact of this mismatch, two methods for discarding non-speech acoustic vectors at recognition time are investigated: variable frame rate processing and voice activity detection. Experiments are discussed for Aurora 2 and for SpeechDat Car Italian. Results show that both methods are highly effective for SpeechDat Car Italian. However, for Aurora 2, feature vector selection based on voice activity detection yields hardly any benefit, while variable frame rate processing actually lowers recognition accuracy somewhat. Several possible explanations of the different results observed for the two databases are discussed.

Bibliographic reference. Veth, Johan de / Mauuary, Laurent / Noe, Bernhard / Wet, Febe de / Sienel, Jürgen / Boves, Louis / Jouvet, Denis (2001): "Feature vector selection to improve ASR robustness in noisy conditions", In EUROSPEECH-2001, 201-204.