Third ESCA/COCOSDA Workshop on Speech Synthesis

November 26-29, 1998
Jenolan Caves House, Blue Mountains, NSW, Australia

Generating Pitch Accent Distributions that Show Individual and Stylistic Differences

Janet E. Cahn

Massachusetts Institute of Technology, Cambridge, MA, USA

I describe a limited-resource approach to generating prosody that mediates text-based information through a model of attention and working memory whose simulation parameters are quantitative. The main parameter quantifies recall. Varying it changes what counts as given and new in a text and, therefore, the pitch accents with which the text is uttered. Currently, the system produces prosody in three styles of read speech (child-like, adult expressive, and knowledgeable) and individual variation within each. A comparison with natural data shows clear and predictable stylistic similarities, although these do not reach statistical significance. Informal feedback is more forgiving: it indicates that the prosody is both natural and expressive for consecutive phrases, but that work is still needed to make this effect consistent throughout the text.
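
The link between recall and accent placement can be illustrated with a deliberately simplified toy. It is not the system's actual model: it assumes that working memory is a fixed-size window of recently read words, that a word found in the window counts as given and is left unaccented, and that any other word counts as new and receives a pitch accent. The function name assign_accents and the parameter recall_capacity are hypothetical, introduced only for this sketch.

```python
from collections import deque


def assign_accents(words, recall_capacity):
    """Label each word "accent" (treated as new) or "none" (treated as given).

    Assumption for illustration only: memory is a sliding window holding the
    last `recall_capacity` words, and the oldest word is forgotten first.
    """
    memory = deque(maxlen=recall_capacity)
    labels = []
    for word in words:
        key = word.lower()
        # A word still held in memory is given; otherwise it is new.
        labels.append("none" if key in memory else "accent")
        memory.append(key)
    return labels


text = "the cat saw the dog and the dog saw the cat".split()
low_recall = assign_accents(text, recall_capacity=2)
high_recall = assign_accents(text, recall_capacity=8)

# Count how many words are accented under each setting.
print(low_recall.count("accent"), high_recall.count("accent"))
```

With the smaller window, repeated words have been forgotten by the time they recur, so more of the text is treated as new and accented. With the larger window, more repetitions are recognized as given and fewer accents result. A realistic treatment would also distinguish content words from function words and would choose among accent types, neither of which this sketch attempts.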

