4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Subband-Crosscorrelation Analysis for Robust Speech Recognition

Shoji Kajita, Kazuya Takeda, Fumitada Itakura

Graduate School of Engineering, Nagoya University, Chikusa-ku, Nagoya, Japan

This paper describes subband-crosscorrelation (SBXCOR) analysis using two-channel signals. SBXCOR analysis extends subband-autocorrelation (SBCOR) analysis, a signal processing technique that extracts periodicities present in speech signals. The performance of SBXCOR is investigated using a DTW word recognizer, both under acoustic conditions simulated on a computer and under a real environmental condition. Under the simulated condition, the speech signals in the two channels are assumed to be perfectly synchronized while the noises are uncorrelated. Consequently, simply summing the two signals raises the effective signal-to-noise ratio by about 3 dB. In this case, SBXCOR is shown to be less robust than SBCOR extracted from the two-channel-summed signal, but more robust than the conventional one-channel SBCOR. The resulting performance was much better than that of the smoothed group delay spectrum and mel-frequency cepstral coefficients. In a real computer room, SBXCOR is shown to be more robust than the two-channel-summed SBCOR.
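The roughly 3 dB figure follows from basic power arithmetic: when the two channels carry the same synchronized signal, summing doubles the signal amplitude (four times the power), while uncorrelated noises of equal power add only in power (twice the power). The following is a minimal numerical sketch of that arithmetic, not the authors' experimental setup. The 200 Hz test tone, the 8 kHz sampling rate, the Gaussian noise, and the function names `snr_db` and `summed_channel_gain_db` are all assumptions made for illustration.

```python
import math
import random


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, from separate signal and noise sequences."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def summed_channel_gain_db(num_samples=50000, noise_std=1.0, seed=0):
    """SNR gain (dB) from summing two channels: identical signal, independent noise."""
    rng = random.Random(seed)

    # Assumed test signal: a 200 Hz sinusoid sampled at 8 kHz,
    # identical (perfectly synchronized) in both channels.
    signal = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0)
              for t in range(num_samples)]

    # Assumed noise model: independent Gaussian noise of equal power per channel.
    noise_a = [rng.gauss(0.0, noise_std) for _ in range(num_samples)]
    noise_b = [rng.gauss(0.0, noise_std) for _ in range(num_samples)]

    snr_single = snr_db(signal, noise_a)

    # After summing, the signal components add coherently (amplitude x2),
    # while the noise components add sample by sample without correlation.
    summed_signal = [2.0 * s for s in signal]
    summed_noise = [a + b for a, b in zip(noise_a, noise_b)]
    snr_summed = snr_db(summed_signal, summed_noise)

    return snr_summed - snr_single


gain = summed_channel_gain_db()
print(f"SNR gain from summing two channels: {gain:.2f} dB")
```

The ideal gain is 10 log10(4/2) = 10 log10(2), about 3.01 dB; the simulated value deviates slightly because the finite noise sequences are not exactly uncorrelated or equal in power.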


Bibliographic reference.  Kajita, Shoji / Takeda, Kazuya / Itakura, Fumitada (1996): "Subband-crosscorrelation analysis for robust speech recognition", In ICSLP-1996, 422-425.