14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Damped Oscillator Cepstral Coefficients for Robust Speech Recognition

Vikramjit Mitra, Horacio Franco, Martin Graciarena

SRI International, USA

This paper presents a new signal-processing technique motivated by the physiology of the human auditory system. In this approach, auditory hair cells are modeled as damped oscillators stimulated by bandlimited time-domain speech signals that act as forcing functions. Oscillation synchrony is induced by time-aligning the forcing functions and coupling them three ways across the individual bands, so that a given oscillator is driven not only by its own critical band's forcing function but also by those of its two neighboring bands. We present two features: the Damped Oscillator Cepstral Coefficient (DOCC), which uses the damped oscillator response to the forcing functions without synchrony, and the Synchronized Damped Oscillator Cepstral Coefficient (SyDOCC), which uses the damped oscillator response to a time-synchronized forcing function. The proposed features are evaluated on the Aurora4 noise- and channel-degraded speech recognition task, and the results indicate that they improve speech-recognition performance in all conditions compared with the baseline mel-cepstral feature and other published noise-robust features.
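
To make the idea of a forced damped oscillator concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no oscillator equation or parameter values, so the standard second-order form x'' + 2*zeta*w0*x' + w0^2*x = f(t), the damping ratio `zeta`, the semi-implicit Euler integrator, and the simple averaging used for three-way coupling are all assumptions made for illustration. The function names `damped_oscillator_response` and `couple_three_way` are hypothetical.

```python
import numpy as np


def damped_oscillator_response(force, fs, f0, zeta=0.1):
    """Displacement of a damped oscillator driven by `force`.

    Assumed model (not taken from the paper):
        x'' + 2*zeta*w0*x' + w0**2 * x = force(t)
    integrated with semi-implicit Euler at sampling rate `fs` (Hz).
    `f0` is the oscillator's natural frequency in Hz.
    """
    w0 = 2.0 * np.pi * f0
    dt = 1.0 / fs
    x, v = 0.0, 0.0
    out = np.empty(len(force))
    for n, f in enumerate(force):
        a = f - 2.0 * zeta * w0 * v - w0 * w0 * x  # acceleration
        v += a * dt                                # update velocity first
        x += v * dt                                # then displacement
        out[n] = x
    return out


def couple_three_way(bands):
    """Average each band's forcing function with its two neighbours.

    `bands` has shape (num_bands, num_samples). Plain averaging is an
    assumption; the edge bands reuse themselves in place of the
    missing neighbour.
    """
    bands = np.asarray(bands, dtype=float)
    padded = np.pad(bands, ((1, 1), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0


if __name__ == "__main__":
    fs, f0 = 16000, 1000.0
    t = np.arange(fs // 2) / fs
    on = damped_oscillator_response(np.sin(2 * np.pi * f0 * t), fs, f0)
    off = damped_oscillator_response(np.sin(2 * np.pi * 3 * f0 * t), fs, f0)
    half = len(t) // 2
    # The oscillator responds far more strongly to a tone at its own frequency.
    print(np.abs(on[half:]).max() > np.abs(off[half:]).max())
```

In this sketch, each band would get one oscillator tuned to that band's center frequency, and the coupled forcing functions from `couple_three_way` would be passed to it in place of the raw band signals when synchrony is desired.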

Bibliographic reference.  Mitra, Vikramjit / Franco, Horacio / Graciarena, Martin (2013): "Damped oscillator cepstral coefficients for robust speech recognition", In INTERSPEECH-2013, 886-890.