Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

A Study of Speaker Adaptation Based on Minimum Classification Error Training

Tomoko Matsui, Sadaoki Furui

NTT Human Interface Laboratories, Musashino-Shi, Tokyo, Japan

This paper studies a speaker adaptation method based on minimum classification error (MCE) training, applied to a speaker-independent speech recognition system based on hidden Markov models (HMMs). In this method, the HMM parameters are adapted to a new speaker using a combination of maximum a posteriori (MAP) and MCE estimation. MAP estimation maximizes the a posteriori probability that the HMMs generate the speaker's data, but this does not guarantee the lowest recognition error. MCE estimation, by contrast, aims directly at minimizing the recognition error and has recently begun to be used for speaker adaptation. Here, MCE estimation is applied to the HMM parameters already adapted by MAP estimation, so that the parameters settle into one of the local minima of the MCE criterion near the MAP-adapted values.
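The two-stage idea described above (MAP adaptation first, then MCE refinement starting from the MAP result) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes one unit-variance Gaussian per phoneme class instead of full HMMs, adapts only the means, and uses a sigmoid-smoothed misclassification measure with plain gradient descent. The function names and the parameters `tau`, `eta`, `gamma`, `lr`, and `steps` are all hypothetical choices made for illustration.

```python
import numpy as np

# Toy assumption: each class is a single unit-variance Gaussian, not an HMM.

def map_adapt_means(prior_means, feats, labels, tau=10.0):
    """MAP update of each class mean; tau weights the speaker-independent prior."""
    adapted = prior_means.copy()
    for k in range(prior_means.shape[0]):
        xk = feats[labels == k]
        # With no data for class k, this reduces to the prior mean (for tau > 0).
        adapted[k] = (tau * prior_means[k] + xk.sum(axis=0)) / (tau + len(xk))
    return adapted

def mce_loss_and_grad(means, feats, labels, eta=1.0, gamma=1.0):
    """Sigmoid-smoothed error count and its gradient with respect to the means."""
    n, num_classes = feats.shape[0], means.shape[0]
    diff = feats[:, None, :] - means[None, :, :]      # shape (n, K, dim)
    g = -0.5 * np.sum(diff ** 2, axis=2)              # discriminants, shape (n, K)
    correct = np.zeros((n, num_classes), dtype=bool)
    correct[np.arange(n), labels] = True
    g_true = g[correct]                               # one entry per sample
    # Soft maximum over the competing (incorrect) classes.
    z = np.where(correct, -np.inf, eta * g)
    zmax = z.max(axis=1, keepdims=True)
    w = np.exp(z - zmax)
    wsum = w.sum(axis=1, keepdims=True)
    lse = zmax[:, 0] + np.log(wsum[:, 0])
    d = -g_true + (lse - np.log(num_classes - 1)) / eta   # misclassification measure
    loss = 0.5 * (1.0 + np.tanh(0.5 * gamma * d))         # numerically stable sigmoid
    dl_dd = gamma * loss * (1.0 - loss)
    dd_dg = w / wsum                                      # competitor weights
    dd_dg[correct] = -1.0                                 # correct class enters with -1
    # The derivative of g_k with respect to mean k is (x - mean_k), i.e. diff.
    grad = np.einsum('n,nk,nkd->kd', dl_dd, dd_dg, diff) / n
    return loss.mean(), grad

def map_then_mce(prior_means, feats, labels, tau=10.0, lr=0.5, steps=50):
    """MAP-adapt the means, then descend the MCE loss from that starting point."""
    means = map_adapt_means(prior_means, feats, labels, tau)
    loss, grad = mce_loss_and_grad(means, feats, labels)
    for _ in range(steps):
        trial = means - lr * grad
        trial_loss, trial_grad = mce_loss_and_grad(trial, feats, labels)
        if trial_loss < loss:
            means, loss, grad = trial, trial_loss, trial_grad
        else:
            lr *= 0.5  # step was too large; shrink it and stay near the MAP start
    return means
```

Starting the descent at the MAP estimate and accepting only loss-reducing steps is one simple way to keep the refined parameters in a nearby local minimum. An actual system would apply the same pattern to HMM state parameters using alignment-based discriminant scores.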

In phoneme recognition experiments, we compare the combination of MAP and MCE estimation with MAP estimation and MCE estimation used individually. We find that the combination is the most effective.


Bibliographic reference. Matsui, Tomoko / Furui, Sadaoki (1995): "A study of speaker adaptation based on minimum classification error training", in EUROSPEECH-1995, 81-84.