Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Speech/Non-Speech Discrimination Combining Advanced Feature Extraction and SVM Learning

Javier Ramírez, Pablo Yélamos, J. M. Górriz, José C. Segura, L. García

Universidad de Granada, Spain

This paper presents an effective speech/non-speech discrimination method for improving the performance of speech processing systems working in noisy environments. The proposed method uses a trained support vector machine (SVM) that defines an optimized non-linear decision rule over different sets of speech features. Two alternative feature extraction processes are compared, based on: i) subband SNR estimation after denoising, and ii) long-term SNR estimation. With both, the SVM-based classifier is able to learn how the signal is masked by the acoustic noise and to define an effective non-linear decision rule. However, a feature vector incorporating contextual information yields better speech/non-speech discrimination even when no denoising is applied. The experimental analysis carried out on the Spanish SpeechDat-Car database shows clear improvements over standard voice activity detectors (VADs), including ITU G.729, ETSI AMR and ETSI AFE for distributed speech recognition (DSR), and over other recently reported VADs.
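
As a rough illustration of the idea described in the abstract (an SVM learning a non-linear speech/non-speech rule over subband SNR feature vectors), the following minimal sketch trains an RBF-kernel SVM on synthetic data. It is not the authors' implementation: the training routine is a kernelised Pegasos-style solver used as a pure-Python stand-in, and the number of subbands, the SNR distributions, the hyperparameters and all function names are assumptions chosen only for illustration.

```python
import math
import random

def rbf(u, v, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def train_kernel_svm(X, y, lam=0.01, gamma=0.01, iters=2000, seed=0):
    """Kernelised Pegasos: stochastic sub-gradient descent on the
    hinge-loss SVM objective. alpha[j] counts how often sample j
    violated the margin when it was drawn."""
    rng = random.Random(seed)
    n = len(X)
    # Precompute the kernel matrix once (fine for a toy data set).
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for t in range(1, iters + 1):
        i = rng.randrange(n)
        score = sum(alpha[j] * y[j] * K[i][j] for j in range(n)) / (lam * t)
        if y[i] * score < 1:  # margin violated: strengthen this sample
            alpha[i] += 1
    return alpha

def predict(x, X, y, alpha, gamma=0.01):
    """Return +1 (speech) or -1 (non-speech) from the sign of the kernel expansion."""
    score = sum(a * yj * rbf(xj, x, gamma)
                for a, yj, xj in zip(alpha, y, X) if a)
    return 1 if score > 0 else -1

def synthetic_frames(n, mean_db, sd_db, bands, rng):
    """Hypothetical per-frame subband SNRs in dB (assumed Gaussian)."""
    return [[rng.gauss(mean_db, sd_db) for _ in range(bands)] for _ in range(n)]

rng = random.Random(1)
BANDS = 4  # assumed number of subbands
speech = synthetic_frames(40, 12.0, 3.0, BANDS, rng)  # assumed speech-frame SNRs
noise = synthetic_frames(40, 0.0, 1.5, BANDS, rng)    # assumed noise-only SNRs
X = speech + noise
y = [1] * len(speech) + [-1] * len(noise)

alpha = train_kernel_svm(X, y)
print("high-SNR frame ->", predict([12.0] * BANDS, X, y, alpha))
print("low-SNR frame  ->", predict([0.0] * BANDS, X, y, alpha))
```

In a real system the rows of X would come from the feature extraction front end (subband or long-term SNR estimates, possibly spanning neighbouring frames for context) rather than from a random generator.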


Bibliographic reference. Ramírez, Javier / Yélamos, Pablo / Górriz, J. M. / Segura, José C. / García, L. (2006): "Speech/non-speech discrimination combining advanced feature extraction and SVM learning", in INTERSPEECH-2006, paper 1134-Wed1FoP.3.