The maximum likelihood linear regression (MLLR) approach to speaker adaptation of continuous density mixture Gaussian HMMs is presented, and its application to static and incremental adaptation in both supervised and unsupervised modes is described. The approach computes a transformation of the mixture component means using linear regression. Each transformation is shared among a number of mixture components, so that adaptation can be effective with only modest amounts of adaptation data, even in large vocabulary systems that employ a very large number of parameters. Results are given for unsupervised incremental adaptation on native-speaker Wall Street Journal (WSJ) data and for static supervised adaptation for non-native speakers. Both sets of results show the effectiveness and flexibility of the MLLR approach.
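As a rough illustration of the sharing idea, the following minimal sketch applies one affine transform per group of mixture components to their means, writing each adapted mean as W times the extended mean vector [1, mu]. It does not reproduce the maximum likelihood estimation of the transforms; the transforms are simply assumed to be given. The function names, the component and class labels, and all numeric values are hypothetical and chosen only for the example.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def adapt_mean(W: Matrix, mean: Vector) -> Vector:
    """Apply an n x (n+1) transform W to the extended mean [1, mu_1, ..., mu_n]."""
    extended = [1.0] + list(mean)  # the leading 1 picks up the offset column of W
    return [sum(w * x for w, x in zip(row, extended)) for row in W]


def adapt_all(means: Dict[str, Vector],
              regression_class: Dict[str, str],
              transforms: Dict[str, Matrix]) -> Dict[str, Vector]:
    """Adapt every component mean with the transform shared by its class."""
    return {name: adapt_mean(transforms[regression_class[name]], mu)
            for name, mu in means.items()}


# Hypothetical toy data: three 2-D component means in two classes.
means = {"m1": [1.0, 2.0], "m2": [0.0, -1.0], "m3": [3.0, 3.0]}
regression_class = {"m1": "class_a", "m2": "class_a", "m3": "class_b"}
transforms = {
    "class_a": [[0.5, 1.0, 0.0],   # columns: offset, then a 2 x 2 identity
                [-0.5, 0.0, 1.0]],
    "class_b": [[0.0, 2.0, 0.0],   # pure scaling by 2, no offset
                [0.0, 0.0, 2.0]],
}

adapted = adapt_all(means, regression_class, transforms)
```

Here `m1` and `m2` are moved by the same transform even if only one of them was observed in the adaptation data, which is why a small amount of data can update a large number of parameters.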
Bibliographic reference. Leggetter, C. J. / Woodland, Phil C. (1995): "Flexible speaker adaptation for large vocabulary speech recognition", In EUROSPEECH-1995, 1155-1158.