EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Smooth Contour Estimation in Data-Driven Pitch Modelling

Kim E. A. Silverman (1), Jerome R. Bellegarda (1), Kevin A. Lenzo (2)

(1) Apple Computer, USA; (2) Carnegie Mellon University, USA

Apple's next-generation text-to-speech system in Mac OS X uses a superpositional pitch model, comprising a relatively smooth underlying F0 contour and a separate contribution from the influence of the phonetic segments. This paper focuses on the data-driven modelling of the underlying contour, based on electroglottographic signals obtained from a corpus of reiterant speech. F0 extraction from such signals leads to more accurate characteristic shapes, as objectively illustrated by a typically low mean absolute frequency deviation (between 2 and 3 Hz) between original and synthetic F0 contours. This in turn supports a better (both more complete and more realistic) model of F0 behavior. Experimental results illustrate the improved prosodic representation resulting from this F0 model.
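
As an illustration of the objective measure quoted in the abstract, the following is a minimal sketch of a mean absolute frequency deviation between two F0 contours. The abstract does not say how the authors computed the figure, so everything here is an assumption: the function name `mean_abs_f0_deviation` is hypothetical, the two contours are taken to be frame-aligned lists of values in Hz, and unvoiced frames are assumed to be marked as 0, NaN, or None and are skipped.

```python
import math


def _voiced(f0):
    # Assumption: unvoiced frames are encoded as None, NaN, or a non-positive value.
    return f0 is not None and not math.isnan(f0) and f0 > 0


def mean_abs_f0_deviation(original_hz, synthetic_hz):
    """Mean absolute difference (Hz) over frames voiced in both contours."""
    if len(original_hz) != len(synthetic_hz):
        raise ValueError("contours must be frame-aligned and equal in length")
    diffs = [
        abs(o - s)
        for o, s in zip(original_hz, synthetic_hz)
        if _voiced(o) and _voiced(s)
    ]
    if not diffs:
        raise ValueError("no frames are voiced in both contours")
    return sum(diffs) / len(diffs)


# Made-up values for illustration only; these are not data from the paper.
original = [120.0, 125.0, 0.0, 130.0]
synthetic = [122.0, 124.0, 110.0, 127.0]
deviation = mean_abs_f0_deviation(original, synthetic)
```

Under these assumptions, a result between 2 and 3 Hz would correspond to the range reported in the abstract. A real comparison would also need to settle how the two contours are time-aligned, which this sketch leaves out.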


Bibliographic reference. Silverman, Kim E. A. / Bellegarda, Jerome R. / Lenzo, Kevin A. (2001): "Smooth contour estimation in data-driven pitch modelling", In EUROSPEECH-2001, 1167-1170.