Fourth ISCA ITRW on Speech Synthesis

August 29 - September 1, 2001
Perthshire, Scotland

A Discourse Model for Pitch-Range Control

Gregor Möhler and Jörg Mayer

Institute of Natural Language Processing, University of Stuttgart, Germany

The width and the position of the pitch range reveal important information about the structure of a spoken discourse. This paper studies the correlation between pitch range and discourse structure on the basis of a large database. The model used to analyze the discourse is based on a two-level description of registers. Primary register features reflect the prosodic phrasing within a discourse segment. Secondary register features depend on the relations between discourse segments, more specifically on the topic structure of the discourse. The pitch range is extracted automatically from a speech database with the help of an F0 parametrization. The study shows that different registers exhibit pitch-range values that differ clearly in position and width. These results can be used to successfully implement a global prominence model within a speech synthesis system. The ideal application is concept-to-speech, where discourse information is in principle available on the input side.

