13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Integrating Adaptive Beam-forming and Auditory Features for Robust Large Vocabulary Speech Recognition

Xie Sun, Qi Peter Li, Manli Zhu, Qiru Zhou

Li Creative Technologies (LcT), Inc., Florham Park, NJ, USA

We demonstrate a system that integrates adaptive beam-forming and auditory features to improve speech recognition accuracy in noisy environments. Adaptive beam-forming with a microphone array uses spatial information to raise the signal-to-noise ratio (SNR) of the recording for a focused speaker, which makes recognition more robust. Auditory features, which model the signal processing functions of the hearing system, have been shown to substantially improve speech recognition accuracy under noisy conditions. In our experiments, integrating adaptive beam-forming with the auditory features yields an absolute gain of more than 50% in speech recognition accuracy over a baseline when 5 dB white noise is added.
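
As a rough illustration of why combining microphones can raise SNR, the sketch below adds white noise to a test tone and then averages four time-aligned channels. It is not the authors' adaptive beam-former: it is a fixed delay-and-sum with zero steering delay, and it assumes that "5 dB" refers to the SNR of each noisy channel and that the noise is independent across microphones. The function names, the 440 Hz tone, the 8 kHz sample rate, and the four-microphone array are all hypothetical choices made for this example.

```python
import math
import random


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(clean, noisy):
    """SNR in dB of `noisy`, measured against the known clean signal."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(noise))


def add_white_noise(clean, target_snr_db, rng):
    """Add Gaussian white noise scaled to give the target SNR (assumed meaning of '5 dB')."""
    noise_power = power(clean) / (10.0 ** (target_snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [c + rng.gauss(0.0, sigma) for c in clean]


def delay_and_sum(channels):
    """Average already time-aligned channels (fixed beam-former, not adaptive)."""
    return [sum(samples) / len(channels) for samples in zip(*channels)]


rng = random.Random(0)  # fixed seed so the run is repeatable
fs = 8000               # hypothetical sample rate in Hz
clean = [math.sin(2 * math.pi * 440 * t / fs) for t in range(fs)]

# Four hypothetical microphones, each with independent noise at 5 dB SNR.
mics = [add_white_noise(clean, 5.0, rng) for _ in range(4)]

single_snr = snr_db(clean, mics[0])
beam_snr = snr_db(clean, delay_and_sum(mics))
print(f"single mic: {single_snr:.1f} dB, 4-mic average: {beam_snr:.1f} dB")
```

For M channels with uncorrelated noise of equal power, averaging lowers the noise power by a factor of M, so the expected gain is 10 log10(M) dB, or about 6 dB for four microphones. An adaptive beam-former goes further by estimating its channel weights from the data, which this sketch does not attempt.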

Index Terms: adaptive beam-forming, auditory features, robust speech recognition, SNR


Bibliographic reference.  Sun, Xie / Li, Qi Peter / Zhu, Manli / Zhou, Qiru (2012): "Integrating adaptive beam-forming and auditory features for robust large vocabulary speech recognition", In INTERSPEECH-2012, 2115-2116.