Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

Integration of Intonation in F0 Trajectory Prediction Using MSD-HMMs

Xiaojun Zou, Xiao Bao, Lidong Luo

Speech and Hearing Research Center (SHRC), Key Laboratory of Machine Perception, Peking University, Beijing, China

Current research in speech synthesis places increasing emphasis on spectral continuity and diverse prosodic effects. The trainable HMM-based speech synthesis method tends to generate more continuous spectral structures than the traditional unit selection method. However, the F0 trajectory generated by HMM-based speech synthesis is often excessively smoothed and lacks prosodic variance. This paper proposes an approach to improve F0 trajectory prediction in Mandarin speech synthesis within the framework of multi-space probability distribution HMMs (MSD-HMMs). In the proposed approach, the intonation, which is predicted by context-dependent decision trees, is integrated into the F0 trajectory generated by the MSD-HMMs as a weighted bias term. Experiments indicate that the approach yields an encouraging improvement in the prosodic effectiveness of Mandarin speech synthesis.
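
The integration step described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function name `add_intonation_bias`, the additive form `f0 + weight * bias`, the default weight, and the use of `None` to mark unvoiced frames (for which an MSD-HMM produces no F0 value). It is not the authors' implementation.

```python
def add_intonation_bias(f0_track, intonation, weight=0.5):
    """Add a weighted intonation bias to a frame-level F0 trajectory.

    Assumptions (not taken from the paper):
    - f0_track holds one value per frame, with None for unvoiced frames;
    - intonation holds one bias value per frame, in the same domain as
      f0_track (for example, both in log-F0);
    - the bias is combined additively, scaled by a single global weight.
    """
    if len(f0_track) != len(intonation):
        raise ValueError("f0_track and intonation must have the same length")

    biased = []
    for f0, bias in zip(f0_track, intonation):
        if f0 is None:
            # Unvoiced frame: there is no F0 value to shift.
            biased.append(None)
        else:
            biased.append(f0 + weight * bias)
    return biased


# Hypothetical three-frame example; the middle frame is unvoiced.
example = add_intonation_bias([5.0, None, 5.5], [0.5, 0.25, -1.0], weight=0.5)
```

Under these assumptions, a weight of 0 leaves the MSD-HMM trajectory unchanged, while larger weights let the predicted intonation contour shift the trajectory more strongly.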

Index Terms: Mandarin speech synthesis, MSD-HMMs, Prosody, Intonation, Tone, Register

Bibliographic reference. Zou, Xiaojun / Bao, Xiao / Luo, Lidong (2010): "Integration of intonation in F0 trajectory prediction using MSD-HMMs", In SP-2010, paper 952.