EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Split-band Perceptual Harmonic Cepstral Coefficients as Acoustic Features for Speech Recognition

Liang Gu, Kenneth Rose

University of California, Santa Barbara, USA

This paper presents a significant modification of our previously proposed speech recognition front-end based on perceptual harmonic cepstral coefficients. The spectrum is split into two frequency bands, corresponding to the harmonic and non-harmonic components. A weighting function, which depends both on the voiced/unvoiced/transitional classification and on the prominence of the harmonic structure, is applied to the harmonic band and ensures an accurate representation of the spectral envelope of voiced and transitional speech. A conventional smoothed spectrum is used in the non-harmonic band. The resulting mixed spectrum is passed through mel-scaled band-pass filters, and the log-energies of the filter outputs are discrete cosine transformed to produce cepstral coefficients. Experiments on Mandarin digit and E-set databases show significant recognition gains over plain perceptual harmonic cepstral coefficients and considerable gains over standard techniques.
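The processing chain described above (band split, mel-scaled filtering, log-energy, discrete cosine transform) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the weighting function, the location of the band boundary, or any parameter values. The sketch assumes that the harmonic band is the lower part of the spectrum up to a hypothetical `cutoff_bin`, that the weighting is a per-bin multiplier supplied by the caller as `harmonic_weight`, and that triangular mel filters and an unnormalized DCT-II are used. All function and parameter names are illustrative.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel-spaced filters over a one-sided spectrum of n_bins points."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(x)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (m + 0.5) / n) @ x

def split_band_cepstrum(power_spec, smoothed_spec, harmonic_weight, cutoff_bin,
                        n_filters=20, n_coeffs=13, sample_rate=8000):
    """Mix a weighted harmonic band with a smoothed non-harmonic band,
    then apply mel filtering, log-energy, and a DCT.

    Assumptions (not taken from the paper): the harmonic band is bins
    [0, cutoff_bin), and the weighting is a simple per-bin multiplier.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    harmonic_weight = np.asarray(harmonic_weight, dtype=float)
    mixed = np.array(smoothed_spec, dtype=float)  # non-harmonic band: smoothed spectrum
    mixed[:cutoff_bin] = harmonic_weight[:cutoff_bin] * power_spec[:cutoff_bin]
    bank = mel_filterbank(n_filters, len(mixed), sample_rate)
    log_energy = np.log(bank @ mixed + 1e-10)  # small floor avoids log(0)
    return dct_ii(log_energy, n_coeffs)
```

In this sketch the voiced/unvoiced/transitional decision and the measure of harmonic prominence would enter only through the values placed in `harmonic_weight`; how those values are computed is left to the full paper.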


Bibliographic reference. Gu, Liang / Rose, Kenneth (2001): "Split-band perceptual harmonic cepstral coefficients as acoustic features for speech recognition", in EUROSPEECH-2001, 583-586.