4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

A Binaural Model as a Front-end for Isolated Word Recognition

Tsuyoshi Usagawa (1), Markus Bodden (2), Klaus Rateitschek (2)

(1) Kumamoto University, Kumamoto, Japan
(2) Ruhr University of Bochum, Bochum, Germany

Small-vocabulary isolated word speech recognition can be implemented on relatively small hardware. Although the recognition problem is more or less solved in noise-free situations, general application is hindered by the dramatic decrease in performance in noisy environments, especially in hands-free applications. This paper presents a binaural front-end for speech recognition. The binaural model, originally developed at Ruhr University of Bochum in Germany, allows effective reduction of interfering noise of any kind: besides stationary noise, concurrent speech signals can also be suppressed. The original model was designed as a precise computer model of the human binaural auditory system and can explain a variety of psychoacoustical phenomena. In addition, the model offers sharp directional selectivity that is superior to that obtained with directional microphones. We simplified this sophisticated model by adapting it to the specific task, and we use the peak position and the peak level of the binaural activity pattern in each frequency band as parameters for pattern matching. Performance was evaluated in terms of recognition rates for a variety of different noisy environments. The results show that the binaural front-end leads to a significant improvement in recognition rates, corresponding in most cases to an SNR enhancement of over 20 dB.
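
As an illustration of the feature described above (peak position and peak level per frequency band), the following minimal sketch extracts both values from one band of a two-channel signal. It is not the authors' model: a plain normalized interaural cross-correlation is swapped in as a stand-in for the binaural activity pattern, the inputs are assumed to be already band-filtered, and the names `cross_correlation`, `peak_features`, and `max_lag` are hypothetical.

```python
import numpy as np


def cross_correlation(left, right, max_lag):
    """Normalized cross-correlation of two equal-length band signals.

    ASSUMPTION: this simple correlation stands in for the binaural
    activity pattern; the model in the paper is more elaborate.
    Returns the lags (-max_lag..max_lag, in samples) and their values.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    lags = np.arange(-max_lag, max_lag + 1)
    values = np.zeros(len(lags))
    if norm == 0.0:
        return lags, values  # silent band: no activity
    for i, lag in enumerate(lags):
        if lag >= 0:
            # Positive lag: the right channel trails the left by `lag` samples.
            prod = np.sum(left[:n - lag] * right[lag:])
        else:
            # Negative lag: the left channel trails the right.
            prod = np.sum(left[-lag:] * right[:n + lag])
        values[i] = prod / norm
    return lags, values


def peak_features(left, right, max_lag):
    """Return (peak position in samples, peak level) for one band."""
    lags, values = cross_correlation(left, right, max_lag)
    k = int(np.argmax(values))
    return int(lags[k]), float(values[k])
```

Under these assumptions, a feature vector for one analysis frame would be built by calling `peak_features` once per frequency band and concatenating the resulting pairs; how the paper actually forms and matches its patterns is not specified in this abstract.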


Bibliographic reference.  Usagawa, Tsuyoshi / Bodden, Markus / Rateitschek, Klaus (1996): "A binaural model as a front-end for isolated word recognition", In ICSLP-1996, 2352-2355.