Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Improving the Performance of HMM-Based Voice Conversion Using Context Clustering Decision Tree and Appropriate Regression Matrix Format

Long Qin, Yi-Jian Wu, Zhen-Hua Ling, Ren-Hua Wang

University of Science & Technology of China, China

This paper aims to improve the performance of an HMM-based voice conversion system that uses line spectral pair (LSP) coefficients as the spectral representation. For model adaptation, it adopts a model clustering technique that ties HMMs into classes according to their phonetic and linguistic contextual factors. In addition, because the LSP coefficients of adjacent orders are related, a regression matrix format suited to the small amount of adaptation training data is suggested. Subjective and objective tests show that the proposed method adapts the source HMMs more accurately and that the synthetic speech generated from the adapted model has better discrimination and speech quality.
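
The abstract does not say which regression matrix format is used. As a rough illustration of why a restricted format can help when adaptation data is scarce, the sketch below assumes a banded matrix, so that each output LSP order depends only on neighbouring input orders. The function names, the band half-width, and the plain least-squares fit on paired vectors are all assumptions made for this example; they stand in for, and are not, the paper's adaptation procedure.

```python
import numpy as np

def band_mask(dim, half_width):
    """Boolean (dim, dim) mask that is True within half_width of the diagonal."""
    idx = np.arange(dim)
    return np.abs(idx[:, None] - idx[None, :]) <= half_width

def free_parameters(dim, half_width):
    """Number of free entries in a banded matrix plus a bias vector."""
    return int(band_mask(dim, half_width).sum()) + dim

def fit_banded_transform(src, tgt, half_width):
    """Least-squares fit of tgt ~ src @ A.T + b with A restricted to a band.

    Hypothetical simplification: ordinary least squares on paired vectors,
    solved one output dimension at a time.
    """
    n, dim = src.shape
    mask = band_mask(dim, half_width)
    A = np.zeros((dim, dim))
    b = np.zeros(dim)
    for i in range(dim):
        cols = np.flatnonzero(mask[i])                     # inputs allowed for row i
        X = np.hstack([src[:, cols], np.ones((n, 1))])     # append a bias column
        coef, *_ = np.linalg.lstsq(X, tgt[:, i], rcond=None)
        A[i, cols] = coef[:-1]
        b[i] = coef[-1]
    return A, b

# Parameter counts for an illustrative 24-dimensional feature vector.
full_count = free_parameters(24, 23)   # full matrix plus bias
tri_count = free_parameters(24, 1)     # tridiagonal matrix plus bias
```

Under these assumptions, the tridiagonal format has far fewer free parameters than the full matrix, which is the kind of reduction that makes estimation from a small adaptation set more reliable.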

Bibliographic reference. Qin, Long / Wu, Yi-Jian / Ling, Zhen-Hua / Wang, Ren-Hua (2006): "Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix format", in INTERSPEECH-2006, paper 1105-Thu1BuP.1.