5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Heterogeneous Measurements and Multiple Classifiers for Speech Recognition

Andrew K. Halberstadt, James R. Glass

MIT Laboratory for Computer Science, USA

This paper addresses the problem of acoustic-phonetic modeling. First, heterogeneous acoustic measurements are chosen in order to maximize the acoustic-phonetic information extracted from the speech signal in preprocessing. Second, classifier systems are presented that make effective use of the resulting high-dimensional acoustic measurement spaces. The techniques used to achieve these two goals can be broadly categorized as hierarchical, committee-based, or a hybrid of the two. This paper presents committee-based and hybrid approaches. In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates of 18.3% and 24.4%, respectively. These error rates are the lowest we have seen reported on these tasks. In addition, experiments with a telephone-based weather information word recognition task led to word error rate reductions of 10-16%.
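
To make the committee-based idea concrete, the following is a minimal sketch of two generic ways a committee of classifiers can be combined, assuming each classifier (for example, one per acoustic measurement set) outputs a posterior distribution over the same set of classes. The abstract does not state which combination rule the authors used, so the function names (`majority_vote`, `combine_by_product`) and the example posteriors are hypothetical illustrations, not the paper's method or data.

```python
import math
from collections import Counter


def majority_vote(posteriors_per_classifier):
    """Return the class index chosen by the most classifiers.

    Each classifier votes for its own highest-posterior class.
    Ties go to the class that received its first vote earliest.
    """
    votes = [max(range(len(p)), key=p.__getitem__)
             for p in posteriors_per_classifier]
    return Counter(votes).most_common(1)[0][0]


def combine_by_product(posteriors_per_classifier, floor=1e-12):
    """Combine posteriors by summing log-probabilities per class.

    This treats the classifiers as independent (an assumption made
    for illustration) and renormalizes so the result sums to 1.
    """
    n_classes = len(posteriors_per_classifier[0])
    log_scores = [0.0] * n_classes
    for posteriors in posteriors_per_classifier:
        for k, p in enumerate(posteriors):
            # Floor the probability to avoid log(0).
            log_scores[k] += math.log(max(p, floor))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical posteriors from three classifiers over three classes.
example = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.6, 0.1],
]
voted_class = majority_vote(example)
combined = combine_by_product(example)
best_class = max(range(len(combined)), key=combined.__getitem__)
```

In this made-up example both rules select the second class (index 1), even though the first classifier alone prefers index 0. Voting uses only each classifier's top choice, while the product rule also takes into account how confident each classifier is.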


Bibliographic reference. Halberstadt, Andrew K. / Glass, James R. (1998): "Heterogeneous measurements and multiple classifiers for speech recognition", in ICSLP-1998, paper 0396.