Speech Prosody 2004

Nara, Japan
March 23-26, 2004

Comparing CART and Fujisaki Intonation Models for Synthesis of US-English Names

Marko Moberg, Kimmo Parssinen

Audio-Visual Systems Laboratory, Nokia Research Center, Tampere, Finland

In this work, two speech synthesis intonation models were compared against a reference created with natural intonation. The models chosen were direct classification and regression tree (CART) based pitch estimation and a simple implementation of the Fujisaki model. The performance of the models and their suitability for low-footprint name synthesis were evaluated in a listening test. The results indicated that the perceived quality of the intonation generated by both models was equal to that of the natural intonation reference. Despite their differences, both models offer a viable, high-quality solution for intonation modeling of US-English names. The results may also apply to other languages and to isolated word synthesis in general.
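For readers unfamiliar with the Fujisaki model, the following is a minimal sketch of its standard textbook form, in which log F0 is the sum of a base frequency, phrase components (impulse responses), and accent components (capped step responses). It is not the authors' implementation, which the abstract does not describe. The function names, the default constants (alpha, beta, gamma), and the example command values are illustrative assumptions.

```python
import math

def phrase_response(t, alpha=2.0):
    """Phrase component Gp(t) = alpha^2 * t * exp(-alpha * t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent component Ga(t) = min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def fujisaki_f0(t, base_hz, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    phrases: list of (onset_time, amplitude) phrase commands
    accents: list of (onset_time, offset_time, amplitude) accent commands
    All constants here are assumed illustrative values, not taken from the paper.
    """
    log_f0 = math.log(base_hz)
    for onset, amplitude in phrases:
        log_f0 += amplitude * phrase_response(t - onset, alpha)
    for onset, offset, amplitude in accents:
        rise = accent_response(t - onset, beta, gamma)
        fall = accent_response(t - offset, beta, gamma)
        log_f0 += amplitude * (rise - fall)
    return math.exp(log_f0)

# Hypothetical commands for a short isolated name: one phrase, one accent.
example_phrases = [(0.0, 0.4)]
example_accents = [(0.1, 0.4, 0.5)]
example_contour = [
    fujisaki_f0(i * 0.01, 100.0, example_phrases, example_accents)
    for i in range(80)  # 0.8 s sampled every 10 ms
]
```

A small number of commands per word is enough to describe the whole contour, which is one reason a model of this kind can suit a low-footprint synthesizer.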


Bibliographic reference. Moberg, Marko / Parssinen, Kimmo (2004): "Comparing CART and Fujisaki intonation models for synthesis of US-English names", in SP-2004, 439-442.