Speech Prosody 2010
Chicago, IL, USA
Current research in speech synthesis places increasing emphasis on spectral continuity and diverse prosodic effects. The trainable HMM-based speech synthesis method tends to generate more continuous spectral structures than the traditional unit selection method. However, the F0 trajectory generated by HMM-based speech synthesis is often excessively smoothed and lacks prosodic variance. This paper proposes an approach to improve F0 trajectory prediction in Mandarin speech synthesis within the framework of multi-space probability distribution HMMs (MSD-HMMs). In the proposed approach, the intonation, which is predicted by context-dependent decision trees, is integrated into the F0 trajectory generated by the MSD-HMMs as a weighted bias term. The experiments indicate an encouraging improvement in the prosodic effectiveness of Mandarin speech synthesis.
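The integration step can be illustrated with a minimal sketch. The abstract states only that the predicted intonation enters the MSD-HMM F0 trajectory as a weighted bias term; everything else here is an assumption made for illustration: the function name `add_intonation_bias`, the per-frame additive form, the default weight of 0.5, and the use of `None` to mark unvoiced frames (which carry no F0 value in an MSD model). The values are assumed to be in a log-F0 domain, which the abstract does not specify.

```python
def add_intonation_bias(f0_hmm, intonation, weight=0.5):
    """Add a weighted intonation bias to an F0 trajectory, frame by frame.

    f0_hmm:     per-frame F0 values from the HMM; None marks an unvoiced frame
                (assumed representation, not taken from the paper).
    intonation: per-frame intonation values, e.g. from a decision tree.
    weight:     hypothetical scaling factor for the bias term.
    """
    if len(f0_hmm) != len(intonation):
        raise ValueError("contours must have the same number of frames")

    combined = []
    for f0, bias in zip(f0_hmm, intonation):
        if f0 is None:
            # Unvoiced frame: there is no F0 value to bias, so pass it through.
            combined.append(None)
        else:
            combined.append(f0 + weight * bias)
    return combined


# Example with three frames, the middle one unvoiced.
result = add_intonation_bias([5.0, None, 5.2], [0.2, 0.1, -0.4], weight=0.5)
```

With a weight of 0 the sketch returns the HMM trajectory unchanged, and larger weights let the intonation contour contribute more to the final F0 values.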
Index Terms: Mandarin speech synthesis, MSD-HMMs, Prosody, Intonation, Tone, Register
Bibliographic reference. Zou, Xiaojun / Bao, Xiao / Luo, Lidong (2010): "Integration of intonation in F0 trajectory prediction using MSD-HMMs", In SP-2010, paper 952.