The recognition performance of FFT or LPC front-ends degrades dramatically when the signal-to-noise ratio (SNR) of the input speech signal approaches 0 dB. An alternative is to model the auditory system, which is the best recognition system that we know. This paper analyses the speech recognition performance of an auditory-based front-end versus an FFT front-end for a large number of speakers in natural environments. First, the auditory models that we used in the experiments are presented. The instantaneous firing rate is then processed by a model of the central auditory system introduced by Wu. We extended Wu's model, focusing on maximizing recognition performance rather than on modelling subtle auditory phenomena. The particular test conditions are then described, and the processing stages are illustrated on isolated words. Finally, the conclusions from our experiments are drawn.
Bibliographic reference. Dobrin, Cristina / Haavisto, Petri / Laurila, Kari / Astola, Jaakko (1995): "Speech recognition experiments in a noisy environment using auditory system modelling", In EUROSPEECH-1995, 131-134.