13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition

Yi Ren Leng, Huy Dat Tran

Human Language Technology Department, Institute for Infocomm Research, A*STAR, Singapore

The Missing Feature Linear-Frequency Cepstral Coefficients (MF-LFCC) is a noise-robust cepstral feature that transforms both clean and noisy signals into a similar representation. Unlike conventional Missing Feature Techniques, the MF-LFCC does not require spectrogram imputation or classifier modification. To improve the noise mask used in the MF-LFCC, we propose using blob detection, a computer vision technique, to identify the peaks that characterize the sparsity of sound event spectrograms. For single sound event recognition using SVM classifiers, the MF-LFCC is shown to significantly outperform the MFCC baseline and the noise-robust ETSI Advanced Front End feature in noisy conditions.
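As a rough illustration of how blob detection can turn a spectrogram into a binary mask of reliable regions, the sketch below applies a difference-of-Gaussians (DoG) filter and thresholds the response. It is not the procedure from the paper: the abstract does not specify the detector, so the DoG filter, the function names (`gaussian_blur`, `blob_mask`), the parameters (`sigma`, `ratio`, `rel_threshold`) and the toy spectrogram are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding; output has the input's shape."""
    k = gaussian_kernel(sigma)
    radius = len(k) // 2
    padded = np.pad(img, radius, mode="edge")
    # Filter along the time axis (rows), then along the frequency axis (columns).
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def blob_mask(spectrogram, sigma=2.0, ratio=1.6, rel_threshold=0.5):
    """Hypothetical helper: mark time-frequency bins inside bright blobs.

    The DoG response is positive on compact bright regions of scale near
    `sigma` and close to zero on flat background. Bins whose response exceeds
    `rel_threshold` times the maximum response are treated as reliable.
    """
    dog = gaussian_blur(spectrogram, sigma) - gaussian_blur(spectrogram, ratio * sigma)
    return dog > rel_threshold * dog.max()

# Toy spectrogram (assumed data): flat background plus one spectral peak
# centered at frequency bin 20, frame 30.
yy, xx = np.mgrid[0:40, 0:60]
spec = 0.1 + np.exp(-((yy - 20) ** 2 + (xx - 30) ** 2) / (2 * 2.0 ** 2))

mask = blob_mask(spec)
print(mask[20, 30], mask[5, 5], int(mask.sum()))
```

In a missing-feature setting, a mask of this kind would flag the bins dominated by the sound event. A practical system would likely search over several scales rather than the single `sigma` used here.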

Index Terms: blob detection, missing feature, robust recognition, sound event recognition


Bibliographic reference. Leng, Yi Ren / Tran, Huy Dat (2012): "Using blob detection in missing feature linear-frequency cepstral coefficients for robust sound event recognition", in INTERSPEECH-2012, 2506-2509.