13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Two-stage Speaker Adaptation Approach for Subspace Gaussian Mixture Model based Nonnative Speech Recognition

Bo Li, Khe Chai Sim

School of Computing, National University of Singapore, Singapore

Nonnative speech recognition is becoming increasingly important as speech applications are deployed worldwide. Meanwhile, because of the large population of nonnative speakers, speaker adaptation remains the most practical way to provide high-performance speech services. The Subspace Gaussian Mixture Model (SGMM) has recently been shown to yield superior performance on various native speech recognition tasks. In this paper, we investigate different SGMM speaker adaptation techniques for nonnative speech recognition. We propose a two-stage direct model adaptation approach based on an analysis of the functions of the SGMM model parameters. Our initial experiments verify that the proposed approach is much more effective than traditional feature-space Maximum Likelihood Linear Regression (MLLR) on SGMM-based nonnative speaker adaptation tasks.

Index Terms: Speaker Adaptation, Nonnative Speech Recognition, Subspace Gaussian Mixture Model

Bibliographic reference. Li, Bo / Sim, Khe Chai (2012): "A two-stage speaker adaptation approach for subspace Gaussian mixture model based nonnative speech recognition", In INTERSPEECH-2012, 1772-1775.