EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Evaluation on Unsupervised Speaker Adaptation Based on Sufficient HMM Statistics of Selected Speakers

Shinichi Yoshizawa (1), Akira Baba (2), Kanako Matsunami (3), Yuichirou Mera (3), Miichi Yamada (3), Akinobu Lee (3), Kiyohiro Shikano (3)

(1) Matsushita Electric Industrial Co., Japan
(2) Laboratories of Image Information Science and Technology, Japan
(3) Nara Institute of Science and Technology, Japan

This paper describes an efficient method of unsupervised speaker adaptation. The method is based on (1) selecting a subset of speakers who are acoustically close to the test speaker, and (2) calculating adapted model parameters from the previously stored sufficient statistics of the selected speakers' data. Only a small amount of unlabeled data from the test speaker is needed for adaptation. Because the stored sufficient HMM statistics of the selected speakers are reused, adaptation is also fast. Compared with a pre-clustering method, the proposed method can obtain a better-matched cluster because the cluster is determined on-line from the test speaker's data. Experimental results show that the proposed method yields a larger improvement over the speaker-independent model than MLLR does. The proposed method is evaluated in detail and discussed.
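
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how acoustic closeness is measured or which parameters are re-estimated, so this sketch assumes closeness is the log-likelihood of the test frames under a single diagonal Gaussian per training speaker, and it re-estimates only the Gaussian means by pooling stored occupancy counts and first-order sums. All names (`select_speakers`, `adapted_means`, the dictionary keys, the toy data) are hypothetical.

```python
import math

def log_likelihood(frames, mean, var):
    """Total log-likelihood of frames under one diagonal Gaussian (assumed scoring model)."""
    total = 0.0
    for x in frames:
        for xi, m, v in zip(x, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
    return total

def select_speakers(frames, speakers, n):
    """Step 1: keep the n training speakers whose models score the test frames highest."""
    ranked = sorted(
        speakers,
        key=lambda s: log_likelihood(frames, s["mean"], s["var"]),
        reverse=True,
    )
    return ranked[:n]

def adapted_means(selected):
    """Step 2: pool stored statistics; mean_k = sum of first-order sums / sum of occupancies.

    Assumes every pooled occupancy is positive.
    """
    num_gauss = len(selected[0]["occ"])
    dim = len(selected[0]["first"][0])
    means = []
    for k in range(num_gauss):
        occ = sum(s["occ"][k] for s in selected)
        pooled = [sum(s["first"][k][d] for s in selected) for d in range(dim)]
        means.append([p / occ for p in pooled])
    return means

# Toy data: one-dimensional features, one Gaussian per model (illustrative values only).
speakers = [
    {"name": "A", "mean": [0.0], "var": [1.0], "occ": [10.0], "first": [[1.0]]},
    {"name": "B", "mean": [0.5], "var": [1.0], "occ": [30.0], "first": [[15.0]]},
    {"name": "C", "mean": [5.0], "var": [1.0], "occ": [20.0], "first": [[100.0]]},
]
test_frames = [[0.2], [0.4]]  # unlabeled frames from the test speaker

chosen = select_speakers(test_frames, speakers, n=2)
means = adapted_means(chosen)
```

Because the per-speaker statistics are computed once and stored, the adaptation step in this sketch reduces to sums and divisions over the selected speakers, with no pass over their original training data.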


Bibliographic reference. Yoshizawa, Shinichi / Baba, Akira / Matsunami, Kanako / Mera, Yuichirou / Yamada, Miichi / Lee, Akinobu / Shikano, Kiyohiro (2001): "Evaluation on unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers", In EUROSPEECH-2001, 1219-1222.