Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Robust Feature Space Adaptation for Telephony Speech Recognition

Xin Lei (1), Jon Hamaker (2), Xiaodong He (2)

(1) University of Washington, USA; (2) Microsoft Speech & Natural Language Group, USA

Speaker adaptation is critical for modern speech recognition systems. In telephony speech recognition systems, however, the use of model adaptation techniques is limited by computational cost and by the need to share models across multiple channels. Feature space adaptation methods such as feature space maximum likelihood linear regression (fMLLR), by contrast, are efficient and well suited to telephony systems. In this work, we first describe techniques for the efficient implementation of online fMLLR adaptation. We then propose feature space maximum a posteriori linear regression (fMAPLR), which incorporates prior knowledge into the feature transform estimation and improves the robustness of the conventional fMLLR approach. Experiments on telephony data indicate that fMAPLR is significantly more robust than fMLLR and outperforms it, especially when the adaptation data is very limited.
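
To illustrate why feature space adaptation leaves the shared acoustic models untouched, the sketch below applies a speaker-specific affine transform x' = A x + b to each feature frame, which is the general form of an fMLLR-style transform. It is a minimal sketch, not the paper's implementation: the function names and the 2-dimensional values of A and b are hypothetical, and the estimation of A and b (maximum likelihood for fMLLR, or with a prior for fMAPLR) is not shown.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def apply_feature_transform(A: Matrix, b: Vector, x: Sequence[float]) -> Vector:
    """Return the affine transform A x + b of one feature vector x."""
    if len(A) != len(b) or any(len(row) != len(x) for row in A):
        raise ValueError("dimension mismatch between A, b and x")
    return [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_utterance(A: Matrix, b: Vector, frames: Sequence[Sequence[float]]) -> List[Vector]:
    """Apply the same speaker transform to every frame of an utterance."""
    return [apply_feature_transform(A, b, frame) for frame in frames]


# Hypothetical values chosen only for illustration; real front ends use
# higher-dimensional features and transforms estimated from adaptation data.
A = [[1.0, 0.5],
     [0.0, 2.0]]
b = [0.5, -1.0]
frames = [[1.0, 2.0], [0.0, 0.0]]

adapted = adapt_utterance(A, b, frames)
```

Because only the features change, one set of acoustic models can serve every caller, and each call carries just its own A and b.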

Bibliographic reference. Lei, Xin / Hamaker, Jon / He, Xiaodong (2006): "Robust feature space adaptation for telephony speech recognition", in INTERSPEECH-2006, paper 1743-Tue1A2O.2.