4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Comparison of Channel Normalisation Techniques for Automatic Speech Recognition Over the Phone

Johan de Veth (1), Louis Boves (1,2)

(1) Department of Language and Speech, University of Nijmegen, Nijmegen, The Netherlands
(2) KPN Research, Leidschendam, The Netherlands

We compared three channel normalisation (CN) methods in the context of a connected digit recognition task over the phone: cepstrum mean subtraction (CMS), RASTA filtering and the Gaussian dynamic cepstrum representation (GDCR). Using a small set of context-independent (CI) continuous Gaussian mixture hidden Markov models (HMMs), we found that CMS and RASTA outperformed the GDCR technique. We show that the main cause of the superiority of CMS over RASTA is the phase distortion introduced by the RASTA filter. Recognition results for a phase-corrected RASTA technique are identical to those of CMS. Our results indicate that an ideal cepstrum-based CN method should (1) effectively remove the DC component, (2) at least preserve modulation frequencies in the range 2-16 Hz and (3) introduce no phase distortion when CI HMMs are used for recognition.
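
To illustrate requirement (1), the following is a minimal sketch of per-utterance cepstrum mean subtraction. It is not the authors' implementation. It assumes the telephone channel is fixed over the utterance, so that it appears as a constant offset on every cepstral frame. The function name `cepstral_mean_subtraction`, the array shapes and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def cepstral_mean_subtraction(cepstra):
    """Subtract the per-utterance mean from each cepstral coefficient.

    cepstra: array of shape (num_frames, num_coeffs), one row per frame.
    Returns an array of the same shape whose time average is zero, i.e.
    the DC component of each coefficient trajectory is removed.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    return cepstra - cepstra.mean(axis=0, keepdims=True)


# Synthetic stand-in for real features (assumption: 200 frames, 12 coefficients).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 12))

# Assumption: a time-invariant channel adds the same offset to every frame.
channel_offset = rng.normal(size=(1, 12))
observed = clean + channel_offset

normalised_clean = cepstral_mean_subtraction(clean)
normalised_observed = cepstral_mean_subtraction(observed)

# Under the stated assumption the channel offset cancels exactly.
print(np.allclose(normalised_clean, normalised_observed))
```

Because the mean is taken over the whole utterance and subtracted from every frame at once, this operation does not shift the coefficient trajectories in time, which relates to requirement (3).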


Bibliographic reference.  Veth, Johan de / Boves, Louis (1996): "Comparison of channel normalisation techniques for automatic speech recognition over the phone", In ICSLP-1996, 2332-2335.