Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

A Study of Speech Recognition System Robustness to Microphone Variations: Experiments in Phonetic Classification

Jane Chang, Victor W. Zue

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

This paper presents experiments in phonetic classification conducted as part of a study on the effects of microphone variations on performance in speech recognition systems. The TIMIT corpus provides data recorded on a close-talking microphone, on a free-field microphone and over telephone lines. The study focuses on the unmatched training and testing conditions under which degradation is most severe. Analysis of baseline performance characterizes the effects of microphone variations. Downsampling is shown to significantly improve performance for bandlimited conditions at the cost of some degradation for non-bandlimited conditions. Comparative analysis of microphone-independent preprocessing techniques, including cepstral mean normalization, RASTA processing, spectral subtraction and codebook-dependent cepstral normalization, reveals the effects and tradeoffs of different compensation techniques.
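As an illustration of one of the compensation techniques named above, the following is a minimal sketch of cepstral mean normalization. It is not the authors' implementation: the function name `cepstral_mean_normalize`, the per-utterance averaging and the toy feature values are all assumptions made for the example. The sketch relies on the standard observation that a fixed linear channel, such as a microphone, appears approximately as a constant additive offset in the cepstral domain, so subtracting the mean of each coefficient over an utterance removes that offset.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean of each cepstral coefficient.

    `frames` is a list of cepstral vectors, one per analysis frame.
    (Hypothetical helper; the paper's exact configuration is not given.)
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient taken over all frames of the utterance.
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


# Toy data (made-up values): the same "utterance" seen through two channels.
clean = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
channel_offset = [0.5, -1.5]  # assumed constant cepstral offset of a second microphone
filtered = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_filtered = cepstral_mean_normalize(filtered)
```

In this idealized case the two normalized feature sets coincide, which is why the technique can be applied without knowing which microphone was used. Real channels and additive noise depart from the constant-offset assumption, so the match is only approximate in practice.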


Bibliographic reference. Chang, Jane / Zue, Victor W. (1994): "A study of speech recognition system robustness to microphone variations: experiments in phonetic classification", in ICSLP-1994, 995-998.