Third ESCA/COCOSDA Workshop on Speech Synthesis

November 26-29, 1998
Jenolan Caves House, Blue Mountains, NSW, Australia

Efficient Adaptation of TTS Duration Model to New Speakers

Chilin Shih (1), Wentao Gu (2), Jan P. H. van Santen (1)

(1) Bell Labs, Lucent Technologies, Murray Hill, NJ, USA
(2) Shanghai Jiaotong University, China

This paper discusses a methodology that uses a minimal set of sentences to adapt an existing TTS duration model so that it captures interspeaker variation. The assumption is that the original duration database contains information about both language-specific and speaker-specific duration characteristics. When training a duration model for a new speaker, only the speaker-specific information needs to be modeled, so the size of the training data can be reduced drastically. Results from several experiments are compared and discussed.

Bibliographic reference. Shih, Chilin / Gu, Wentao / Santen, Jan P. H. van (1998): "Efficient adaptation of TTS duration model to new speakers", in SSW3-1998, 105-110.