4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Overview of Speech Enhancement Techniques for Automatic Speaker Recognition

Javier Ortega-García, Joaquín González-Rodríguez

Dept. de Ingeniería Audiovisual y Comunicaciones, Universidad Politécnica de Madrid, Madrid, Spain

Real-world conditions differ from ideal or laboratory conditions, causing a mismatch between the training and testing phases and, consequently, degrading the performance of automatic speaker recognition systems [1]. Many strategies have been adopted to cope with acoustical degradation; in some applications of speaker identification systems, a clean sample of speech is needed prior to the recognition stage. This has justified the use of procedures that reduce the impact of acoustical noise on the desired signal, giving rise to techniques for the enhancement of noisy speech [2, 9]. This paper presents a comparative performance analysis of single-channel (based on classical spectral subtraction and some derived alternatives), dual-channel (based on adaptive noise cancelling) and multi-channel (using microphone arrays) speech enhancement techniques. The techniques are evaluated with different types of noise at different SNRs, as a pre-processing stage to an ergodic HMM-based speaker recognizer.

