Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Noise Robust Model-Based Voice Activity Detection

Ángel de la Torre, Javier Ramírez, Carmen Benítez, José C. Segura, L. García, Antonio J. Rubio

Universidad de Granada, Spain

We propose a model-based voice activity detector (VAD) derived from the Vector Taylor Series (VTS) approach. A Gaussian mixture trained on clean speech provides the decision rule for speech/non-speech detection. The VTS approach adapts this Gaussian mixture to the noise conditions, yielding stable performance over a wide range of SNRs. We evaluate the proposed VAD both on speech/non-speech detection and on its application to robust speech recognition. Compared with other VAD methods, it shows the best trade-off in speech/non-speech detection. When used for Wiener filtering and for frame dropping, it also provides the best recognition results.
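As an illustration of the general idea only, the following sketch shows how a clean-speech Gaussian mixture might be shifted toward the noise conditions and then used in a speech/non-speech decision. The abstract does not specify the features, the order of the VTS expansion, or the exact decision rule, so everything here is an assumption: a scalar log-energy feature, a zeroth-order VTS update of the means only (variances left unchanged), a single Gaussian noise model, and a log-likelihood-ratio threshold. The function names and numeric values are hypothetical.

```python
import math


def vts_adapt_mean(mu_x, mu_n):
    """Zeroth-order VTS shift of a clean log-energy mean under additive noise.

    Assumes the log-domain mismatch function y = x + log(1 + exp(n - x)),
    evaluated at the clean-speech mean mu_x and the noise mean mu_n.
    """
    return mu_x + math.log1p(math.exp(mu_n - mu_x))


def log_gauss(x, mu, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def log_gmm(x, weights, means, variances):
    """Log-density of a univariate Gaussian mixture (log-sum-exp for stability)."""
    terms = [math.log(w) + log_gauss(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def is_speech(frame, clean_gmm, noise_mu, noise_var, threshold=0.0):
    """Hypothetical decision rule: adapted speech GMM vs. noise Gaussian.

    The clean GMM means are adapted to the current noise estimate, then the
    frame is labeled speech if the log-likelihood ratio exceeds the threshold.
    Keeping the clean variances is a simplification made for this sketch.
    """
    weights, means, variances = clean_gmm
    adapted_means = [vts_adapt_mean(m, noise_mu) for m in means]
    llr = (log_gmm(frame, weights, adapted_means, variances)
           - log_gauss(frame, noise_mu, noise_var))
    return llr > threshold


# Toy clean-speech model (weights, means, variances) and noise estimate.
clean_gmm = ([0.5, 0.5], [2.0, 8.0], [1.0, 1.0])
noise_mu, noise_var = 4.0, 0.25

print(is_speech(8.0, clean_gmm, noise_mu, noise_var))  # frame well above the noise level
print(is_speech(4.0, clean_gmm, noise_mu, noise_var))  # frame at the noise level
```

In this toy setting, the low-energy mixture component is pulled up to roughly the noise level, so frames near the noise mean are better explained by the noise model, while frames well above it are better explained by the adapted speech mixture.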


Bibliographic reference. Torre, Ángel de la / Ramírez, Javier / Benítez, Carmen / Segura, José C. / García, L. / Rubio, Antonio J. (2006): "Noise robust model-based voice activity detection", in INTERSPEECH-2006, paper 1476-Wed3A1O.1.