13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition

Kenichi Kumatani (1,2), Bhiksha Raj (1), Rita Singh (1), John McDonough (1)

(1) School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
(2) Disney Research, Pittsburgh (DRP), PA, USA

This paper presents a new microphone-array post-filtering algorithm for distant speech recognition (DSR). Conventional post-filtering methods assume a static noise-field model and, under that assumption, use a Wiener filter mechanism to estimate the noise parameters. In contrast, we show how to build the Wiener post-filter from actual noise observations, without any noise-field assumption. The algorithm is framed within a state-of-the-art beamforming technique, namely maximum negentropy (MN) beamforming with super-directivity. We investigate the effectiveness of the proposed post-filter for DSR through experiments on noisy data collected in a car under different acoustic conditions. The experiments show that the new post-filtering mechanism achieves up to a 20% relative reduction in word error rate (WER) under the represented noise conditions, compared with a single distant microphone. By comparison, super-directive (SD) beamforming followed by Zelinski post-filtering achieves a relative WER reduction of only up to 11%. Other post-filters evaluated perform similarly in comparison to the proposed post-filter.

Index Terms: Microphone array, Post-filter, Distant speech recognition, Automotive speech application
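
The abstract describes building a Wiener post-filter from measured noise rather than from an assumed noise-field model. The sketch below illustrates only that general idea and is not the authors' algorithm: it is a single-channel simplification that averages the power of noise-only frames and applies the textbook Wiener gain to a beamformer output, whereas the paper works with spatially-correlated noise measurements across the array. The function names, the `gain_floor` parameter, and the input layout (frames by frequency bins) are all assumptions made for illustration.

```python
import numpy as np

def estimate_noise_psd(noise_frames):
    """Average power per frequency bin over noise-only STFT frames.

    noise_frames: complex array of shape (frames, bins), assumed to
    contain only noise (hypothetical input, e.g. from a speech pause).
    """
    return np.mean(np.abs(np.asarray(noise_frames)) ** 2, axis=0)

def wiener_postfilter_gain(output_psd, noise_psd, gain_floor=0.1):
    """Textbook Wiener gain: speech PSD / (speech PSD + noise PSD).

    output_psd: PSD of the noisy beamformer output, per frequency bin.
    noise_psd:  measured noise PSD, per frequency bin.
    gain_floor: assumed lower bound that limits over-suppression.
    """
    output_psd = np.asarray(output_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    # Speech PSD estimate: noisy PSD minus noise PSD, clipped at zero.
    speech_psd = np.maximum(output_psd - noise_psd, 0.0)
    # Guard against division by zero in silent bins.
    gain = speech_psd / np.maximum(output_psd, 1e-12)
    return np.maximum(gain, gain_floor)
```

In use, the gain would be multiplied bin by bin with the beamformer output spectrum of each frame before resynthesis. The floor trades residual noise against speech distortion, and its value here is arbitrary.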

Bibliographic reference.  Kumatani, Kenichi / Raj, Bhiksha / Singh, Rita / McDonough, John (2012): "Microphone array post-filter based on spatially-correlated noise measurements for distant speech recognition", In INTERSPEECH-2012, 298-301.