ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing
ICC Jeju, Korea
In this paper, we focus on speech recognition using multiple microphones of varying quality. The quality of one channel may be much better than that of the other channels, and even better than the output of standard microphone array techniques such as the delay-and-sum beamformer. It is therefore important to find a good indicator for selecting the channel to use for recognition. This paper introduces Decoder-Based Channel Selection (DBCS), which provides a criterion for evaluating the quality of each channel by comparing the speech recognition hypotheses obtained from compensated and uncompensated feature vectors. We evaluate the performance of DBCS using speech data recorded with a PDA-like mockup. DBCS with Delta-Cepstrum Normalization for single-channel compensation provides a significant improvement over the delay-and-sum beamformer. In addition, the concept of DBCS is extended to the delay-and-sum beamformer outputs of various subsets of microphones. This extension yields some additional improvement in speech recognition accuracy.
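The selection idea described above can be illustrated with a minimal sketch. The abstract does not state the exact criterion, so the code below makes two loudly labeled assumptions: the two hypotheses for each channel are compared by word-level edit distance, and the channel whose hypotheses disagree least is taken to be the cleanest. The function names `word_edit_distance` and `select_channel`, the channel labels, and the example transcripts are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_channel(hyps: Dict[str, Tuple[str, str]]) -> str:
    """Pick a channel from per-channel hypothesis pairs.

    `hyps` maps a channel label to (hypothesis from uncompensated features,
    hypothesis from compensated features).

    ASSUMPTION (not from the paper): the channel with the smallest
    word-level disagreement between its two hypotheses is selected.
    Ties go to the first channel in insertion order.
    """
    def disagreement(channel: str) -> int:
        raw, compensated = hyps[channel]
        return word_edit_distance(raw.split(), compensated.split())

    return min(hyps, key=disagreement)


if __name__ == "__main__":
    # Hypothetical recognizer outputs for two microphones.
    example = {
        "mic1": ("call john smith", "call joan smith please"),
        "mic2": ("call john smith", "call john smith"),
    }
    print(select_channel(example))
```

In this sketch the recognizer itself is outside the code: each channel is assumed to have already been decoded twice, once with and once without feature compensation. The same function could in principle be applied to beamformer outputs of microphone subsets by treating each subset as one more entry in the dictionary.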
Bibliographic reference. Obuchi, Yasunari (2004): "Multiple-microphone robust speech recognition using decoder-based channel selection", In SAPA-2004, paper 52.