Symposium on Machine Learning in Speech and Language Processing (MLSLP)
Portland, Oregon, USA
A conventional approach to noise-robust automatic speech recognition is to apply speech enhancement before recognition. However, speech enhancement cannot completely remove noise, so a mismatch between the enhanced speech and the acoustic model inevitably remains. Uncertainty decoding approaches have been used to mitigate this mismatch by accounting for the feature uncertainty during decoding. We have previously proposed dynamic variance adaptation, which estimates the feature uncertainty from adaptation data by maximizing the likelihood or a discriminative criterion such as maximum mutual information (MMI). For unsupervised adaptation, the transcriptions are obtained from a first recognition pass and thus contain errors. Such errors are fatal when using a discriminative criterion. In this paper, we investigate the recently proposed differenced MMI (dMMI) discriminative criterion for unsupervised dynamic variance adaptation, because it inherently includes a mechanism that mitigates the influence of errors in the transcriptions.
Index Terms: Robust speech recognition, dynamic variance adaptation, unsupervised adaptation, discriminative training, dMMI
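As an illustration of the uncertainty decoding idea mentioned in the abstract, the following minimal sketch shows one common way to account for feature uncertainty: the per-dimension feature variance is added to the variance of a diagonal-covariance Gaussian before the likelihood is evaluated. The function names, the diagonal-covariance assumption, and all numeric values are hypothetical and chosen only for illustration. The sketch does not reproduce the dynamic variance adaptation or dMMI estimation procedure of the paper, and simply takes the feature variance as given.

```python
import math


def diag_gaussian_loglik(x, mean, var):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertainty_decoding_loglik(x_enh, mean, var, feat_var):
    """Evaluate the enhanced feature with variance compensation.

    The feature uncertainty (feat_var) is added to the model variance,
    so unreliable dimensions contribute a flatter likelihood.
    """
    compensated_var = [v + u for v, u in zip(var, feat_var)]
    return diag_gaussian_loglik(x_enh, mean, compensated_var)


# Hypothetical two-dimensional example (values are illustrative only).
x_enh = [1.0, 3.0]      # enhanced feature; second dimension is far from the mean
mean = [1.0, 0.0]       # Gaussian mean of one acoustic model component
var = [1.0, 1.0]        # Gaussian variance of that component
feat_var = [0.0, 8.0]   # assumed feature uncertainty: second dimension unreliable

plain = diag_gaussian_loglik(x_enh, mean, var)
compensated = uncertainty_decoding_loglik(x_enh, mean, var, feat_var)
print(f"without compensation: {plain:.3f}")
print(f"with compensation:    {compensated:.3f}")
```

In this toy case, the second feature dimension is flagged as unreliable, so the compensated score penalizes its deviation from the mean less than the uncompensated score does. How the feature variance itself is estimated is the subject of the paper and is not modeled here.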
Bibliographic reference. Delcroix, Marc / Ogawa, Atsunori / Nakatani, Tomohiro / Nakamura, Atsushi (2012): "Dynamic variance adaptation using differenced maximum mutual information", In MLSLP-2012, 9-12.