Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

A Successive State and Mixture Splitting for Optimizing the Size of Models in Speech Recognition

Soo-Young Suk (1), Seong-Jun Hahm (2), Ho-Youl Jung (2), Hyun-Yeol Chung (2)

(1) AIST, Japan; (2) Yeungnam University, Korea

This paper proposes a Successive State and Mixture Splitting (SSMS) algorithm for optimizing the size of the models used in speech recognition on small mobile devices. The algorithm is based on a Continuous Hidden Markov Model (CHMM) structure with a variable-parameter topology, which minimizes the number of model parameters and reduces recognition time. SSMS splits the Gaussian Output Probability Density Distribution (GOPDD) to build a variable-parameter, context-independent model. Unlike Successive State Splitting, which generates context-dependent models, the algorithm constructs a context-independent model with a suitable number of states and mixtures for each recognition unit by automatically splitting the GOPDD in the time and mixture domains. Recognition results show that the proposed SSMS reduces the total number of Gaussians by up to 40.0% compared with fixed-parameter models at the same recognition performance.
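The two splitting directions named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the splitting criterion nor the re-estimation step, so the code below assumes diagonal-covariance Gaussians and a common mean-perturbation heuristic (halve the weight, shift the mean by a fraction of the standard deviation). The names `Gaussian`, `split_gaussian`, `split_in_mixture_domain`, `split_in_time_domain`, and the value `eps=0.2` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gaussian:
    weight: float        # mixture weight within its state
    mean: List[float]
    var: List[float]     # diagonal covariance (assumption)


def split_gaussian(g: Gaussian, eps: float = 0.2) -> Tuple[Gaussian, Gaussian]:
    """Split one Gaussian into two by shifting the mean by +/- eps standard
    deviations per dimension and halving the weight (assumed heuristic)."""
    offsets = [eps * math.sqrt(v) for v in g.var]
    lo = Gaussian(g.weight / 2, [m - d for m, d in zip(g.mean, offsets)], list(g.var))
    hi = Gaussian(g.weight / 2, [m + d for m, d in zip(g.mean, offsets)], list(g.var))
    return lo, hi


def split_in_mixture_domain(state: List[Gaussian], index: int) -> List[Gaussian]:
    """Replace one mixture component of a state by its two halves,
    so the state gains one Gaussian."""
    lo, hi = split_gaussian(state[index])
    return state[:index] + [lo, hi] + state[index + 1:]


def split_in_time_domain(states: List[List[Gaussian]], index: int) -> List[List[Gaussian]]:
    """Replace one state by two consecutive states. In this sketch both start
    as copies of the original; a real system would re-estimate them."""
    def copy_state(s: List[Gaussian]) -> List[Gaussian]:
        return [Gaussian(g.weight, list(g.mean), list(g.var)) for g in s]

    original = states[index]
    return states[:index] + [copy_state(original), copy_state(original)] + states[index + 1:]


def total_gaussians(states: List[List[Gaussian]]) -> int:
    """Model size measured as the total number of Gaussians over all states."""
    return sum(len(s) for s in states)


# Toy model for one recognition unit: one state with one 2-D Gaussian.
model = [[Gaussian(1.0, [0.0, 1.0], [1.0, 4.0])]]
model[0] = split_in_mixture_domain(model[0], 0)   # 1 state, 2 mixtures
model = split_in_time_domain(model, 0)            # 2 states, 2 mixtures each
```

Because each recognition unit keeps its own list of states and each state its own list of components, the number of states and mixtures can differ per unit, which is the sense in which the topology is "variable-parameter" in this sketch. Which state or component to split next, and when to stop, would depend on a criterion that the abstract does not specify.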


Bibliographic reference.  Suk, Soo-Young / Hahm, Seong-Jun / Jung, Ho-Youl / Chung, Hyun-Yeol (2006): "A successive state and mixture splitting for optimizing the size of models in speech recognition", In INTERSPEECH-2006, paper 2022-Mon3BuP.11.