4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Training Machine Classifiers to Match the Performance of Human Listeners in a Natural Vowel Classification Task

Martin Hunke, Thomas Holton

School of Engineering, San Francisco State University, San Francisco, CA, USA

The purpose of this research is to determine how models of human auditory physiology can improve the performance of automatic speech recognition systems. In this study, a series of experiments was undertaken to discover how humans categorize and confuse vowels in natural speech. The recognition task comprised a large number of vowel nuclei isolated from sentences spoken naturally by a large number of talkers. Machine vowel classifiers were then trained to match the results of these vowel categorization experiments using two input feature representations: a spectral-energy representation and a representation derived from an auditory model. Classifiers trained on representations derived from the auditory model match human performance and are more robust to noise and spectral filtering than classifiers trained on spectral-energy representations.

Bibliographic reference. Hunke, Martin / Holton, Thomas (1996): "Training machine classifiers to match the performance of human listeners in a natural vowel classification task", in ICSLP-1996, 574-577.