5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Tempo and Its Change in Spontaneous Speech

Anton Batliner, Andreas Kießling (1), Ralf Kompe (2), Heinrich Niemann, Elmar Nöth

Univ. Erlangen-Nürnberg, Lehrstuhl für Mustererkennung, Erlangen, Germany
(1) now with Ericsson Eurolab, Nürnberg, Germany
(2) now with Sony Stuttgart Technology Center, Fellbach, Germany

In this paper, we give a first account of speech tempo and its change in spontaneous speech, based on a very large database (Verbmobil, i.e., human-human appointment dialogs). As features representing speech tempo, we computed mean normalized speech duration (speaking rate) and normalized phone duration in different ways. The importance of these features is evaluated by means of an automatic classification of boundaries and accents, in which different sets of prosodic features (also including information about F0, energy, pause, etc.) were used. The best results (83% for accents, 88% for boundaries, two classes each) were achieved when all features were used. For the second issue, change of tempo was labelled manually. We present the characteristic feature values for changes from slow to fast and from fast to slow, as well as the results of an automatic classification of change of tempo (72% for three classes). Finally, we discuss the possible function of change of tempo and its use in automatic speech processing.

Bibliographic reference. Batliner, Anton / Kießling, Andreas / Kompe, Ralf / Niemann, Heinrich / Nöth, Elmar (1997): "Tempo and its change in spontaneous speech", in EUROSPEECH-1997, 763-766.