14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Regularized MVDR Spectrum Estimation-Based Robust Feature Extractors for Speech Recognition

Md. Jahangir Alam (1), Patrick Kenny (2), Douglas O'Shaughnessy (1)

(1) INRS-EMT, Canada
(2) CRIM, Canada

In this paper, we present two robust feature extractors that estimate the speech power spectrum with a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator used in many front-ends, including conventional MFCC. Direct spectrum estimators, e.g., the single-tapered periodogram, have high variance and perform poorly under noisy and adverse conditions. The RMVDR spectrum estimator has low spectral variance and is robust to mismatched conditions. Based on the RMVDR spectrum estimator, two robust feature extractors are proposed: robust RMVDR cepstral coefficients (RRMCC) and normalized RMVDR cepstral coefficients (NRMCC), which incorporate an auditory domain spectrum enhancement (ASE) method and a medium duration power bias subtraction (MDPBS) technique, respectively, for enhancement of the speech spectrum. Speech recognition experiments are conducted on the AURORA-4 corpus, and performance is compared with MFCC, PLP, MVDR-MFCC, RMVDR-MFCC, PMVDR, the ETSI advanced front-end (ETSI-AFE), PNCC, CFCC, and the robust feature extractor (RFE) of [6]. Experimental results demonstrate that the proposed robust feature extractors outperform the other robust front-ends in terms of percentage word accuracy on the AURORA-4 large vocabulary continuous speech recognition (LVCSR) task under different mismatch conditions.
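
As a rough illustration of the kind of estimator the abstract refers to, the sketch below computes a textbook MVDR power spectrum, P(w) = 1 / (v(w)^H R^-1 v(w)), from the autocorrelation matrix R of a single frame. The abstract does not say how the regularization is performed, so simple diagonal loading of R is used here purely as an assumed stand-in; it should not be read as the authors' RMVDR method. The function name `mvdr_spectrum` and the parameters `order`, `n_freqs`, and `loading` are hypothetical, chosen only for this example.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=257, loading=1e-3):
    """Textbook MVDR power spectrum of one frame.

    ASSUMPTION: diagonal loading stands in for the regularizer; the
    abstract does not specify the regularization used in RMVDR.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Symmetric Toeplitz autocorrelation matrix, R[i, j] = r[|i - j|]
    lags = np.abs(np.arange(order + 1)[:, None] - np.arange(order + 1)[None, :])
    R = r[lags]
    # Assumed regularizer: add a fraction of the frame energy to the diagonal
    R = R + loading * r[0] * np.eye(order + 1)
    # Steering vectors v(w) = [1, e^{jw}, ..., e^{j*order*w}] on a grid over [0, pi]
    omegas = np.linspace(0.0, np.pi, n_freqs)
    V = np.exp(1j * np.outer(np.arange(order + 1), omegas))
    # Quadratic form v^H R^{-1} v for every frequency at once
    quad = np.real(np.sum(np.conj(V) * np.linalg.solve(R, V), axis=0))
    return omegas, 1.0 / quad

# Usage: a noisy sinusoid at pi/4 rad/sample
rng = np.random.default_rng(0)
t = np.arange(400)
frame = np.sin(0.25 * np.pi * t) + 0.01 * rng.standard_normal(t.size)
omegas, power = mvdr_spectrum(frame)
peak_freq = omegas[np.argmax(power)]
```

In this sketch a larger `loading` value flattens the estimated spectrum, trading resolution for lower variance. An all-zero frame is not handled, since R would then be singular.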

Bibliographic reference.  Alam, Md. Jahangir / Kenny, Patrick / O'Shaughnessy, Douglas (2013): "Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition", In INTERSPEECH-2013, 891-895.