13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition

Xiangang Li, Dan Su, Zaihu Pang, Xihong Wu

Speech and Hearing Research Center, Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China

In this paper, a probabilistic speaker-class (PSC) based acoustic modeling method is proposed to account for speaker variability in HMM-based speech recognition systems. First, within the context of speaker-class based speech recognition, an experiment is conducted to investigate the performance of speaker-class recognition based on hard-cut speaker clustering. Then, in the proposed method, probabilistic latent speaker analysis is introduced: the speaker-class dependent acoustic models are trained using a soft-decision speaker clustering method and are combined according to the speaker-class distribution in the decoding phase. Experiments conducted on a 600-hour speech corpus showed improvement in a large vocabulary continuous speech recognition task.
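
The abstract does not state how the speaker-class dependent models are combined at decoding time. One plausible reading, offered here only as an illustrative sketch and not as the authors' formulation, is a mixture in which each class-dependent likelihood is weighted by the probability of that speaker class. The function name, arguments, and the mixture form itself are assumptions introduced for this example.

```python
import math


def combine_class_log_likelihoods(log_likelihoods, class_weights):
    """Return log(sum_k w_k * exp(ll_k)), computed stably in the log domain.

    log_likelihoods: log p(x | class k), one value per speaker class
                     (hypothetical scores from class-dependent models).
    class_weights:   p(class k | speaker), non-negative, summing to 1
                     (an assumed speaker-class distribution).
    """
    if len(log_likelihoods) != len(class_weights):
        raise ValueError("one weight is required per speaker class")

    # Skip zero-weight classes so math.log never receives 0.
    terms = [
        ll + math.log(w)
        for ll, w in zip(log_likelihoods, class_weights)
        if w > 0.0
    ]
    if not terms:
        raise ValueError("at least one class weight must be positive")

    # Log-sum-exp: subtract the largest term before exponentiating.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Soft decision: both class models contribute to the score.
soft_score = combine_class_log_likelihoods(
    [math.log(0.2), math.log(0.4)], [0.5, 0.5]
)

# Hard-cut decision: a one-hot weight vector selects a single class model.
hard_score = combine_class_log_likelihoods(
    [math.log(0.2), math.log(0.4)], [1.0, 0.0]
)
```

Under this assumed form, hard-cut clustering is the special case where the class distribution is one-hot, so a misclassified speaker is scored entirely by the wrong model, whereas a soft distribution spreads that risk across classes.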

Index Terms: speech recognition, probabilistic latent speaker analysis, speaker clustering, speaker-class

Bibliographic reference. Li, Xiangang / Su, Dan / Pang, Zaihu / Wu, Xihong (2012): "Probabilistic speaker-class based acoustic modeling for large vocabulary continuous speech recognition", in INTERSPEECH-2012, 1219-1222.