4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Goethe for Prosody

Stefan Rapp

Institut für Maschinelle Sprachverarbeitung (IMS), Universität Stuttgart, Germany

In this paper, we describe how a recording of Goethe’s "Die Leiden des jungen Werther", published on a multimedia CD-ROM [7], was made accessible for prosody research. The recording is of interest because of its prosodic richness: it displays a large variety of registers and speaking styles. Application areas include the development of prosody models for German TTS, unsupervised learning of pitch accent types, corpus search for research on prosody-semantics and prosody-syntax interaction, and the study of more global prosodic parameters (speaking rate, pitch range) that define registers or speaking styles. The four-hour recording was segmented into phonemes, syllables and words using HMM speech recognition techniques [5, 13] together with a large pronunciation lexicon [1]. A part-of-speech tagger [14] was applied to the corpus to yield time-aligned POS tags. The parametrization of fundamental frequency (F0) was inspired by the German adaptation of the tone sequence model of intonation used in Stuttgart [11, 6]. We describe an intermediate phonetic representation layer that uses the syllable alignment to parametrize the F0 contour as a superposition of three functions: a hyperbolic tangent, a Gaussian and a constant.
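The superposition of a hyperbolic tangent, a Gaussian and a constant can be illustrated with a minimal sketch. The abstract does not give the exact functional form, so the function name `f0_model`, its parameter names, and all numeric values below are assumptions chosen for illustration only, not the parametrization actually used for the corpus.

```python
import math


def f0_model(t, tanh_amp, tanh_slope, tanh_center,
             gauss_amp, gauss_center, gauss_width, offset):
    """Hypothetical F0 value (Hz) at time t (s) as a sum of three terms.

    The form of each term is an assumption made for this sketch:
      - a hyperbolic tangent, modelling a gradual rise or fall,
      - a Gaussian, modelling a local peak,
      - a constant, modelling the baseline level.
    """
    step = tanh_amp * math.tanh(tanh_slope * (t - tanh_center))
    bump = gauss_amp * math.exp(-0.5 * ((t - gauss_center) / gauss_width) ** 2)
    return step + bump + offset


# Illustrative (made-up) parameters for one syllable spanning 0.0 to 0.2 s.
params = dict(tanh_amp=10.0, tanh_slope=20.0, tanh_center=0.1,
              gauss_amp=15.0, gauss_center=0.1, gauss_width=0.03,
              offset=120.0)

# Sample the modelled contour every 50 ms across the syllable.
times = [i * 0.05 for i in range(5)]
contour = [f0_model(t, **params) for t in times]

for t, f0 in zip(times, contour):
    print(f"{t:.2f} s  {f0:.1f} Hz")
```

In a sketch like this, the syllable alignment would supply the time window for each fit, and the seven parameters per syllable would serve as the intermediate representation; how the parameters are actually estimated from the measured F0 is not specified in the abstract.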

