Intonation: Theory, Models, and Applications

Athens, Greece
September 18-20, 1997


Generating F0 Contours for Speech Synthesis Using the Tilt Intonation Theory

Kurt Dusterhoff, Alan W. Black

Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK

This paper presents a method for generating F0 contours for a speech synthesis system using the Tilt intonation theory ([10], [9]). The Tilt theory offers an abstract description of natural F0 contours that can be derived automatically from natural speech. Given a speech database labelled with Tilt events, this paper shows how the data can be used to train a model that adequately predicts Tilt parameters from features available in a text-to-speech system, and hence produces natural-sounding F0 contours. After a short description of the Tilt theory, the database and the features used to generate the parameters are presented. The work is then compared with a previous, similar experiment on the same database that used the ToBI intonation labelling system [2]. The Tilt method not only produces better results (RMSE 32.5 and correlation 0.60) but, because it allows data to be labelled automatically, also promises easier training from general speech databases.
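
The two figures quoted above, RMSE and correlation, are standard measures of how closely a generated F0 contour follows a reference contour. As a minimal sketch, assuming both contours have been sampled at the same time points and are given as equal-length lists of F0 values, they could be computed as follows. The function names and the sample-by-sample alignment are illustrative assumptions, not the evaluation procedure used in this paper.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length F0 sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("contours must be non-empty and of equal length")
    squared_error = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared_error / len(predicted))


def pearson_correlation(predicted, reference):
    """Pearson correlation coefficient between two equal-length F0 sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("contours must be non-empty and of equal length")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    covariance = sum((p - mean_p) * (r - mean_r)
                     for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        # Correlation is undefined for a perfectly flat contour.
        raise ValueError("correlation is undefined for a constant contour")
    return covariance / math.sqrt(var_p * var_r)
```

A lower RMSE means the generated contour is closer to the reference in absolute terms, while a correlation nearer to 1 means the two contours rise and fall together even if they differ by a constant offset.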

