Sixth ISCA Workshop on Speech Synthesis
Given that state-of-the-art speech synthesis systems have already reached a high level of naturalness, it is time to move from the current read-speech framework to talking speech. For this purpose, it is necessary to investigate how disfluencies can be included in speech synthesis and can even increase its naturalness. This paper builds on previously presented work and focuses on finding a local model of filled-pause rhythm. A statistical study of rhythm effects around filled pauses is presented and, based on the correlation between rhythm variables, a regression model is proposed to predict filled-pause duration and prepausal lengthening.
Bibliographic reference. Adell, Jordi / Bonafonte, Antonio / Escudero, David (2007): "Statistical analysis of filled pauses' rhythm for disfluent speech synthesis", In SSW6-2007, 223-227.