EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Unsupervised Noisy Environment Adaptation Algorithm Using MLLR and Speaker Selection

Miichi Yamada (1), Akira Baba (2), Shinichi Yoshizawa (2), Yuichiro Mera (1), Akinobu Lee (1), Hiroshi Saruwatari (1), Kiyohiro Shikano (1)

(1) Nara Institute of Science and Technology, Japan
(2) Laboratories of Image Information Science and Technology, Japan

We propose an unsupervised acoustic model adaptation algorithm for noisy environments that combines MLLR with speaker selection. The algorithm requires only one arbitrary utterance and environmental noise data. Adaptation proceeds in four steps. (1) Speakers are selected from a large speaker database by scoring the single arbitrary utterance against GMM speaker models. (2) Initial speaker-adapted HMM acoustic models are computed from the HMM sufficient statistics of the selected speakers; these statistics are pre-calculated and stored. (3) The environmental noise data are superimposed on a small subset of the selected speakers' clean speech data. (4) MLLR adaptation is carried out using this noise-superimposed speech. The algorithm is evaluated on a 20k-vocabulary newspaper dictation task in noisy environments. It attains a word correct rate of 85.7% at 25 dB SNR, slightly better than a matched model obtained by EM training on the whole speech database with noise superimposed. It also outperforms the HMM composition algorithm by 7%.
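Steps (1) and (3) above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the diagonal-covariance GMM form, the average per-frame log-likelihood ranking, and the power-based SNR definition are all assumptions made for illustration, since the abstract does not specify them. Steps (2) and (4), which operate on HMM statistics, are not shown.

```python
import numpy as np


def diag_gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) features; weights: (K,); means, variances: (K, D).
    The diagonal-covariance form is an assumption, not taken from the paper.
    """
    frames = np.asarray(frames, dtype=np.float64)
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, K)
    log_comp = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_comp.max(axis=1, keepdims=True)
    loglik = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(np.mean(loglik))


def select_speakers(frames, speaker_gmms, n_select):
    """Step (1), sketched: rank database speakers by GMM score on one utterance.

    speaker_gmms maps a speaker id to (weights, means, variances).
    n_select is a hypothetical parameter; the abstract gives no value.
    """
    scores = {spk: diag_gmm_avg_loglik(frames, *gmm)
              for spk, gmm in speaker_gmms.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_select]


def mix_at_snr(speech, noise, snr_db):
    """Step (3), sketched: superimpose noise on clean speech at a target SNR.

    SNR is defined here as the ratio of mean signal powers, an assumption.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat or trim the noise recording to match the speech length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that speech_power / (gain^2 * noise_power)
    # equals 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise
```

In this sketch, calling `mix_at_snr(clean, noise, 25.0)` on each clean utterance of the speakers returned by `select_speakers` would produce the kind of noise-superimposed adaptation data that step (4) passes to MLLR.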


Bibliographic reference. Yamada, Miichi / Baba, Akira / Yoshizawa, Shinichi / Mera, Yuichiro / Lee, Akinobu / Saruwatari, Hiroshi / Shikano, Kiyohiro (2001): "Unsupervised noisy environment adaptation algorithm using MLLR and speaker selection", in EUROSPEECH-2001, 869-872.