Conventional methods for studying the temporal organization of speech generally suffer from inaccuracies in locating segmental boundaries. This paper proposes a new method for measuring the temporal variations in the speech rate of a given target utterance relative to another utterance chosen as a reference. Using the dynamic programming (DP) matching procedure commonly used in speech recognition, a warping function is found that maps the time axis of the target onto that of the reference, and the relative local speech rate is defined as the reciprocal of the slope of this time-axis warping function. Analysis of both Japanese and English utterances has revealed the structure of the local speech rate variations and has demonstrated the potential utility of the method for synthesizing speech in various styles and at various speech rates.
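The definition above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one scalar feature per frame (a real system would compare spectral feature vectors), a textbook DP matching recursion with three allowed steps, and a finite-difference slope taken over a small window. The function names `dtw_path`, `warping_function`, and `local_speech_rate`, and the `half_window` parameter, are hypothetical and chosen only for illustration.

```python
def dtw_path(target, reference):
    """Align two scalar feature sequences by DP matching.

    Returns a list of (target_index, reference_index) pairs.
    Assumption: absolute difference as the local distance and the
    three basic steps (diagonal, vertical, horizontal).
    """
    n, m = len(target), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(target[i - 1] - reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end of both sequences to their start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path


def warping_function(path, n_target):
    """Map each target frame to the mean reference frame aligned with it."""
    sums = [0.0] * n_target
    counts = [0] * n_target
    for i, j in path:
        sums[i] += j
        counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


def local_speech_rate(warp, half_window=2):
    """Reciprocal of the local slope of the warping function.

    The slope is a finite difference over up to `half_window` frames
    on each side (an assumed smoothing choice). Needs at least two frames.
    """
    n = len(warp)
    rates = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n - 1, i + half_window)
        slope = (warp[hi] - warp[lo]) / (hi - lo)
        rates.append(1.0 / slope if slope > 0 else float("inf"))
    return rates
```

As a quick check, take `reference = [0, 1, ..., 7]` and a target in which every reference frame is repeated twice. The warping function then has slope 0.5 away from the edges, so this sketch returns 2.0 for the interior frames. How that number is read as "faster" or "slower" depends on which time axis is treated as the independent variable, which is an assumption here and should be checked against the paper itself.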
Bibliographic reference. Ohno, Sumio / Fujisaki, Hiroya (1995): "A method for quantitative analysis of the local speech rate", In EUROSPEECH-1995, 421-424.