13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Phoneme Class Based Adaptation For Mismatch Acoustic Modeling of Distant Noisy Speech

Seçkin Uluskan, John H. L. Hansen

Center for Robust Speech Systems, University of Texas at Dallas, Richardson, TX, USA

This paper proposes a new adaptation strategy for distant noisy speech, based on phoneme classes, for context-independent acoustic models. Unlike previous approaches such as MLLR-MAP adaptation, which adapt the acoustic model to the features, our phoneme class based adaptation (PCBA) adapts the distant speech features to an acoustic model trained on close-microphone TIMIT sentences. The essence of PCBA is a transformation that makes the distribution of each phoneme class in distant noisy speech similar to that of the close-microphone acoustic model in the thirteen-dimensional MFCC space (mostly in the c0-c1 plane). It applies a mean, orientation and variance adaptation scheme to each phoneme class to compensate for the mismatch. The new adapted features and the new, improved acoustic models produced by PCBA outperform those created by MLLR-MAP adaptation for automatic speech recognition (ASR) and keyword spotting (KWS). PCBA also offers a new and powerful way of understanding the acoustic modeling of distant speech.

Index Terms: phoneme class, distant noisy speech, mismatch acoustic modeling, feature adaptation
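
The abstract does not give the exact form of the PCBA transform, so the following is only an illustrative sketch of what a per-class mean, orientation and variance adaptation could look like, not the authors' implementation. It assumes each phoneme class is summarized by a mean vector and a full covariance matrix in MFCC space, and it uses a standard whitening-and-recoloring linear map, chosen here for illustration, to move the distant-speech statistics onto the close-microphone ones. The function names `fit_class_transform` and `adapt_features`, and the availability of per-frame phoneme-class labels for the distant data, are assumptions.

```python
import numpy as np


def fit_class_transform(distant, close, floor=1e-10):
    """Fit an affine map (A, b) for one phoneme class.

    distant, close: arrays of shape (n_frames, n_dims) holding MFCC vectors
    of the same phoneme class from the distant and close-microphone data.
    The map y = A @ x + b gives the distant frames the mean and covariance
    of the close-microphone frames (assumed whitening/recoloring scheme).
    """
    mu_d = distant.mean(axis=0)
    mu_c = close.mean(axis=0)
    cov_d = np.cov(distant, rowvar=False)
    cov_c = np.cov(close, rowvar=False)

    # Eigenvectors describe the orientation of each class cloud,
    # eigenvalues describe the variance along those directions.
    w_d, v_d = np.linalg.eigh(cov_d)
    w_c, v_c = np.linalg.eigh(cov_c)
    w_d = np.maximum(w_d, floor)  # guard against near-singular covariances
    w_c = np.maximum(w_c, 0.0)

    whiten = v_d @ np.diag(1.0 / np.sqrt(w_d)) @ v_d.T  # cov_d^(-1/2)
    color = v_c @ np.diag(np.sqrt(w_c)) @ v_c.T         # cov_c^(1/2)

    a = color @ whiten      # orientation and variance adaptation
    b = mu_c - a @ mu_d     # mean adaptation
    return a, b


def adapt_features(features, labels, transforms):
    """Apply the per-class maps to distant frames.

    features: (n_frames, n_dims) array; labels: (n_frames,) array of class ids;
    transforms: dict mapping class id -> (A, b). Frames whose class has no
    transform are left unchanged.
    """
    adapted = features.copy()
    for cls, (a, b) in transforms.items():
        mask = labels == cls
        adapted[mask] = features[mask] @ a.T + b
    return adapted
```

In use, one transform would be fitted per phoneme class from labeled frames and the adapted distant features would then be decoded with the unchanged close-microphone model. How the class labels for the distant data are obtained is not stated in the abstract and is left open here.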


Bibliographic reference.  Uluskan, Seçkin / Hansen, John H. L. (2012): "Phoneme class based adaptation for mismatch acoustic modeling of distant noisy speech", In INTERSPEECH-2012, 1780-1783.