Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Cepstral Channel Normalization Techniques for HMM-Based Speaker Verification

Aaron E. Rosenberg, Chin-Hui Lee, Frank K. Soong

Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ, USA

Mismatched recording and channel conditions between training sessions and verification trials can seriously degrade the performance of speaker verification systems. The effect of linear channel distortions can be compensated by subtracting the cepstrum attributable to the distortion from the cepstrum of the observed signal. Three cepstral normalization techniques were studied to evaluate their effect on the performance of a speaker verification system, using a telephone network database of connected-digit password utterances. The three techniques represent the cepstral distortion as a long-term cepstral average, as a short-term cepstral average, and as a maximum likelihood estimate of the observed cepstrum with respect to HMM parameters. Overall, verification performance improves by 30 to 45% with cepstral normalization over a baseline condition. The greater improvements are obtained for longer utterances. No significant differences in performance are found among the three techniques.
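The subtraction described in the abstract works because a linear channel convolves with the speech signal in time, which becomes a multiplication in the spectrum and therefore an additive, roughly constant offset in the cepstrum. The following is a minimal sketch of the two averaging-based ideas, not the authors' implementation: the function names, the list-of-frames representation, and the centered sliding window with a `half_window` parameter are all assumptions made for illustration. The abstract does not say how the short-term average is computed, and the maximum likelihood technique is not sketched here.

```python
from typing import List

# Hypothetical representation: one cepstral vector (list of floats) per frame.
Frames = List[List[float]]


def cepstral_mean(frames: Frames) -> List[float]:
    """Average each cepstral coefficient over the given frames."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[k] for f in frames) / n for k in range(dim)]


def subtract_long_term_mean(frames: Frames) -> Frames:
    """Subtract the mean taken over the whole utterance from every frame."""
    mean = cepstral_mean(frames)
    return [[c - m for c, m in zip(f, mean)] for f in frames]


def subtract_short_term_mean(frames: Frames, half_window: int) -> Frames:
    """Subtract a mean taken over a window centered on each frame.

    The window shape and size are assumptions, not taken from the paper.
    The window is truncated at the utterance boundaries.
    """
    out = []
    for t in range(len(frames)):
        lo = max(0, t - half_window)
        hi = min(len(frames), t + half_window + 1)
        local = cepstral_mean(frames[lo:hi])
        out.append([c - m for c, m in zip(frames[t], local)])
    return out


# Toy data: the same "clean" frames seen through a channel that adds a
# constant cepstral offset to every frame.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
channel = [0.5, -1.0]
observed = [[c + h for c, h in zip(f, channel)] for f in clean]

normalized_clean = subtract_long_term_mean(clean)
normalized_observed = subtract_long_term_mean(observed)
```

In this toy case `normalized_clean` and `normalized_observed` agree, since a constant offset is removed entirely by mean subtraction. With real speech the utterance mean also contains speech-dependent information, which is consistent with the abstract's observation that longer utterances benefit more.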


Bibliographic reference. Rosenberg, Aaron E. / Lee, Chin-Hui / Soong, Frank K. (1994): "Cepstral channel normalization techniques for HMM-based speaker verification", in ICSLP-1994, 1835-1838.