Speech Prosody 2006

Dresden, Germany
May 2-5, 2006

Modelling Hesitation for Synthesis of Spontaneous Speech

Rolf Carlson (1), Kjell Gustafson (1,2), Eva Strangert (3)

(1) CSC, Department of Speech, Music and Hearing, KTH, Stockholm, Sweden
(2) Acapela Group Sweden AB, Solna, Sweden
(3) Department of Philosophy and Linguistics, Phonetics, Umeå University, Sweden

The current work deals with the modelling of one type of disfluency: hesitation. A perceptual experiment using speech synthesis was designed to evaluate two duration features found to be correlates of hesitation: pause duration and final lengthening. A variation of the F0 slope before the hesitation was also included. The most important finding is that the total duration increase, rather than the contribution of either factor alone, is the valid cue. In addition, our findings lead us to assume an interaction with syntax. The absence of strong effects of the induced F0 variation was unexpected, and we consider several possible explanations for this result.

Bibliographic reference.  Carlson, Rolf / Gustafson, Kjell / Strangert, Eva (2006): "Modelling hesitation for synthesis of spontaneous speech", In SP-2006, paper 069.