Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Unsupervised Speaker Normalization by Speaker Markov Model Converter for Speaker-Independent Speech Recognition

Pascale Fung, Tatsuya Kawahara, Shuji Doshita

Department of Information Science, Kyoto University, Sakyo-ku, Kyoto, Japan

We present a new speaker normalization method in which new speaker (NS) data are converted into data resembling the reference speaker (RS) utterances. The Speaker Markov Model Converter (SMMC) converts input NS spectrum data into an RS label sequence, which is passed directly to a Hidden Markov Model (HMM) recognition system. The Converter parameters are estimated from the NS spectrum DP-aligned with the RS spectrum and the RS label stream. The Converter is trained using the NS input test data and the original RS training data, which makes the normalization process unsupervised. Converter training, which includes parameter estimation and improvement, runs in parallel with the recognition process, and iterations are performed to improve the Converter. HMM score thresholding, template matching, and DP thresholding are applied to select suitable data for the unsupervised mapping of NS and RS data.
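
The DP alignment step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes DP alignment means standard dynamic time warping with a Euclidean frame distance and a symmetric step pattern, and the function names (dp_align, transfer_labels) and the toy data are hypothetical. The sketch aligns NS frames with RS frames and then copies the RS label of each aligned frame onto the NS frame, yielding the kind of frame-level NS/RS label pairing from which converter parameters could be estimated.

```python
import numpy as np


def dp_align(ns, rs):
    """Align two frame sequences by dynamic programming (assumed: plain DTW).

    ns, rs: arrays of shape (n_frames, n_dims).
    Returns (path, total_cost), where path is a list of (ns_index, rs_index).
    """
    n, m = len(ns), len(rs)
    # Local distance between every NS frame and every RS frame (Euclidean, assumed).
    dist = np.linalg.norm(ns[:, None, :] - rs[None, :, :], axis=2)

    # cost[i, j] = best accumulated cost of aligning ns[:i] with rs[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
            cost[i, j] = dist[i - 1, j - 1] + best_prev

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),  # diagonal
            (cost[i - 1, j], i - 1, j),          # advance NS only
            (cost[i, j - 1], i, j - 1),          # advance RS only
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return path, float(cost[n, m])


def transfer_labels(path, rs_labels, n_ns_frames):
    """Give each NS frame the label of an RS frame it is aligned with.

    If an NS frame is aligned with several RS frames, the last one wins
    (a simplification chosen for this sketch).
    """
    labels = [None] * n_ns_frames
    for ns_idx, rs_idx in path:
        labels[ns_idx] = rs_labels[rs_idx]
    return labels


# Toy one-dimensional "spectra" and RS labels (made up for illustration).
ns_frames = np.array([[0.0], [0.0], [1.0], [2.0]])
rs_frames = np.array([[0.0], [1.0], [1.0], [2.0]])
rs_labels = ["a", "b", "b", "c"]

path, total_cost = dp_align(ns_frames, rs_frames)
ns_labels = transfer_labels(path, rs_labels, len(ns_frames))
print(path, total_cost, ns_labels)
```

The accumulated cost returned by dp_align is also the kind of quantity a DP threshold could be applied to when selecting data, although the abstract does not specify how the thresholds are defined.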


Bibliographic reference. Fung, Pascale / Kawahara, Tatsuya / Doshita, Shuji (1991): "Unsupervised speaker normalization by speaker Markov model converter for speaker-independent speech recognition", in EUROSPEECH-1991, 1111-1114.