Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Auto-Segmentation Based VAD for Robust ASR

Yu Shi, Frank K. Soong, Jian-Lai Zhou

Microsoft Research Asia, China

An auto-segmentation based endpointing algorithm for robust ASR is proposed. The algorithm consists of two successive steps: (1) homogeneous segment partitioning and (2) segment clustering. Because the first step is self-segmenting, it does not need a noise model and is applicable to different noise types at various SNRs. The dynamic programming based segment partitioning supplies the clustering step with homogeneous segments rather than individual frames, which yields a more robust VAD mechanism. Experiments are performed on the AURORA2 digit database, comparing the new algorithm with the ETSI standard for DSR. The new algorithm is assessed quantitatively under several evaluation criteria: ROC curves, speech/non-speech discrimination, and speech recognition performance.
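The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, the segment distortion measure, how the number of segments is chosen, or the clustering method. The sketch assumes a scalar per-frame energy value, a within-segment squared-error cost, a fixed segment count `k`, and a length-weighted two-means clustering of segment means. The function names `partition_segments` and `cluster_segments` are hypothetical.

```python
from typing import List, Tuple


def partition_segments(x: List[float], k: int) -> List[Tuple[int, int]]:
    """Split x into k contiguous segments that minimize the total
    within-segment squared error, using dynamic programming.
    Returns half-open (start, end) index pairs. (Assumed cost function.)"""
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    # Prefix sums let us evaluate any segment's squared error in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(x):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(a: int, b: int) -> float:
        # Squared error of x[a:b] around its own mean.
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    inf = float("inf")
    # best[j][i]: minimum cost of splitting x[:i] into j segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for a in range(j - 1, i):
                c = best[j - 1][a] + cost(a, i)
                if c < best[j][i]:
                    best[j][i] = c
                    back[j][i] = a
    # Trace the optimal boundaries back from the end of the sequence.
    bounds = []
    i = n
    for j in range(k, 0, -1):
        a = back[j][i]
        bounds.append((a, i))
        i = a
    return bounds[::-1]


def cluster_segments(x: List[float], segments: List[Tuple[int, int]],
                     iters: int = 20) -> List[bool]:
    """Length-weighted two-means clustering of segment means (an assumed
    stand-in for the paper's clustering step). True marks segments in the
    higher-energy cluster, treated here as speech."""
    means = [sum(x[a:b]) / (b - a) for a, b in segments]
    lengths = [b - a for a, b in segments]
    lo, hi = min(means), max(means)
    labels = [False] * len(means)
    for _ in range(iters):
        labels = [abs(m - hi) < abs(m - lo) for m in means]
        hi_w = sum(n for n, is_hi in zip(lengths, labels) if is_hi)
        lo_w = sum(n for n, is_hi in zip(lengths, labels) if not is_hi)
        if hi_w:
            hi = sum(m * n for m, n, is_hi in zip(means, lengths, labels) if is_hi) / hi_w
        if lo_w:
            lo = sum(m * n for m, n, is_hi in zip(means, lengths, labels) if not is_hi) / lo_w
    return labels


# Toy per-frame energies: quiet, loud, quiet (made-up values).
energy = [1.0, 1.2, 0.9, 8.0, 8.5, 7.9, 8.2, 1.1, 1.0]
segments = partition_segments(energy, 3)
is_speech = cluster_segments(energy, segments)
```

On this toy input the partitioning recovers the three runs of similar energy, and the clustering marks only the middle run as the high-energy class. Because whole segments are labeled instead of single frames, an isolated noisy frame inside a segment cannot flip the decision on its own, which is the intuition behind the robustness claim in the abstract.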


Bibliographic reference. Shi, Yu / Soong, Frank K. / Zhou, Jian-Lai (2006): "Auto-segmentation based VAD for robust ASR", in INTERSPEECH-2006, paper 1749-Wed3A1O.2.