Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

Incorporation of Excitation Source and Duration Variations in Speech Synthesized at Different Speaking Rates

M. Sri Harish Reddy, Bayya Yegnanarayana

International Institute of Information Technology, Hyderabad, India

The effect of speaking rate on the excitation source is examined using instantaneous fundamental frequency (F0) and perceived loudness (η). Both the instantaneous F0 and η appear to increase from normal to fast speech, whereas their changes from normal to slow speech are speaker-specific. The study of duration variations in voiced, unvoiced and silence segments shows that the duration changes are not uniform when the speaking rate is varied. These observed variations in the excitation source and durations are incorporated in the epoch-based duration modification method. Perceptual studies show that these variations are significant for the perception of speaking rate.

Index Terms: speaking rate, duration modification, excitation source feature

Bibliographic reference.  Reddy, M. Sri Harish / Yegnanarayana, Bayya (2010): "Incorporation of excitation source and duration variations in speech synthesized at different speaking rates", In SP-2010, paper 725.