Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

A Discriminative Filter Bank Model for Speech Recognition

Alain Biem (1), Erik McDermott (1), Shigeru Katagiri (2)

(1) ATR Human Information Processing Research Laboratories, Soraku-gun, Kyoto, Japan
(2) ATR Interpreting Telecommunications Research Laboratories, Soraku-gun, Kyoto, Japan

This paper investigates the realization of a filter bank model that achieves minimum classification error. A filter-bank feature extraction module is optimized jointly with the classifier's parameters so as to minimize the errors occurring at the back-end classifier, within the framework of the Minimum Classification Error/Generalized Probabilistic Descent (MCE/GPD) method. The method was first applied to readjusting various parameters of filter banks linearly spaced on the Mel scale for a Japanese vowel recognition task. Analysis of the feature extraction process shows how the parts of the spectrum that are relevant to discrimination are captured. The method was then applied to a multi-speaker word recognition system, where it reduced the word error rate by more than 20%.
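
The MCE criterion named above can be illustrated with a minimal sketch. It assumes the commonly used form of the MCE loss: a misclassification measure that compares the correct class score with a soft maximum over the competing scores, passed through a sigmoid. The abstract does not give the exact form used in the paper, so the function names, the parameters `eta` and `alpha`, and the example scores are illustrative assumptions.

```python
import math


def misclassification_measure(scores, correct, eta=2.0):
    """Assumed form: d = -g_correct + (1/eta) * log(mean_j exp(eta * g_j)),
    where j runs over the competing (incorrect) classes."""
    competitors = [g for j, g in enumerate(scores) if j != correct]
    m = max(competitors)
    # Subtract the maximum before exponentiating for numerical stability.
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitors) / len(competitors)
    return -scores[correct] + m + math.log(mean_exp) / eta


def mce_loss(scores, correct, eta=2.0, alpha=1.0):
    """Sigmoid of the misclassification measure: a smooth 0/1 error count."""
    d = misclassification_measure(scores, correct, eta)
    return 1.0 / (1.0 + math.exp(-alpha * d))


# Hypothetical discriminant scores for three classes.
scores = [3.0, 1.0, 0.0]
loss_if_class0_is_correct = mce_loss(scores, correct=0)  # below 0.5
loss_if_class1_is_correct = mce_loss(scores, correct=1)  # above 0.5
```

Because this loss is differentiable in the scores, a gradient-based procedure such as GPD can in principle propagate it back through the classifier and into the filter bank parameters, which is the joint optimization the abstract describes.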


Bibliographic reference. Biem, Alain / McDermott, Erik / Katagiri, Shigeru (1995): "A discriminative filter bank model for speech recognition", in EUROSPEECH-1995, 545-548.