13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Gaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse Environments

Wooil Kim, John H. L. Hansen

Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering & Computer Science, University of Texas at Dallas, Richardson, TX, USA

This study proposes an acoustic model adaptation scheme that uses untranscribed data to improve speech recognition in severely adverse environments. In the proposed method, a clean GMM is estimated from clean training data, and a noise-corrupted GMM is obtained by MAP adaptation over the adaptation data. Each Gaussian component of the adapted HMMs is obtained by applying the transform of the most similar Gaussian component of the GMM. The proposed mixture-selective model adaptation method is evaluated using an LDC corpus which represents severely adverse communication channel environments. The experimental results show that the proposed adaptation method performs comparably to or better than conventional MLLR adaptation. The proposed method is also effective at improving speech recognition using independent adaptation data sets. Performance results demonstrate that the proposed adaptation method is significantly more effective at improving speech recognition in severely noisy conditions, where transcribed data is unavailable and baseline ASR fails to accurately transcribe the adaptation data due to acoustic condition mismatch.
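The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the similarity measure or the form of the transform, so the code assumes Euclidean distance between means as the similarity measure, a per-component mean offset (noisy mean minus clean mean) as the transform, and mean-only MAP adaptation with a relevance factor. The function names map_adapt_means and gaussian_map_adapt and the parameter relevance are hypothetical.

```python
import numpy as np


def map_adapt_means(means, variances, weights, data, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (assumed form).

    means, variances: (K, D); weights: (K,); data: (T, D) untranscribed frames.
    """
    # Per-frame, per-component log-likelihoods, shape (T, K).
    diff = data[:, None, :] - means[None, :, :]
    log_prob = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )
    # Component posteriors, computed stably in the log domain.
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order statistics per component.
    n_k = post.sum(axis=0)                       # (K,)
    first = post.T @ data                        # (K, D)
    data_mean = first / np.maximum(n_k, 1e-10)[:, None]

    # Interpolate between the data mean and the prior (clean) mean.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * data_mean + (1.0 - alpha) * means


def gaussian_map_adapt(hmm_means, clean_means, noisy_means):
    """Shift each HMM Gaussian by the offset of its most similar GMM component.

    Similarity here is Euclidean distance between means (an assumption).
    """
    dist = np.linalg.norm(
        hmm_means[:, None, :] - clean_means[None, :, :], axis=2
    )
    nearest = dist.argmin(axis=1)                # index of closest GMM component
    offsets = noisy_means - clean_means          # assumed per-component transform
    return hmm_means + offsets[nearest], nearest
```

Because the mapping is driven only by the clean GMM and its MAP-adapted counterpart, no transcription of the adaptation data is needed in this sketch, which matches the untranscribed setting the abstract describes.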

Index Terms: model adaptation, untranscribed data, Gaussian mapping, adverse environments, robust speech recognition


Bibliographic reference.  Kim, Wooil / Hansen, John H. L. (2012): "Gaussian map based acoustic model adaptation using untranscribed data for speech recognition in severely adverse environments", In INTERSPEECH-2012, 1764-1767.