4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
Using the short-time relative speech rate defined by the authors, this paper determines the optimum width of the smoothing window through perceptual experiments on the naturalness of re-synthesized speech. With the optimum window of 270 ms, relative speech rates are obtained for both ‘fast’ and ‘slow’ utterances of the same sentence, taking an utterance produced at a ‘normal’ speech rate as the reference. The averaged results show that the speech rate control function for an utterance can be approximately decomposed into a global component for each sentence and local components for each bunsetsu and each major syntactic boundary. Based on these results, a scheme is presented for controlling the local speech rate of a reference utterance to obtain a synthetic utterance of an arbitrary global speech rate.
Bibliographic reference. Ohno, Sumio / Fukumiya, Masamichi / Fujisaki, Hiroya (1996): "Quantitative analysis of the local speech rate and its application to speech synthesis", In ICSLP-1996, 2254-2257.
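Illustrative sketch (not taken from the paper). The abstract reports a 270 ms smoothing window but does not state the window shape or the frame rate, so the code below is only a rough illustration under assumptions: a centered rectangular (moving-average) window, a hypothetical 10 ms frame step, and a hypothetical function name `smooth_relative_rate`. The input is assumed to be a frame-wise relative speech rate measured against the ‘normal’ reference utterance.

```python
def smooth_relative_rate(rates, frame_ms=10.0, window_ms=270.0):
    """Smooth a frame-wise relative speech rate with a centered moving average.

    Assumptions (not from the paper): rectangular window, 10 ms frame step.
    Near the edges the window is truncated to the available frames.
    """
    if frame_ms <= 0 or window_ms <= 0:
        raise ValueError("frame_ms and window_ms must be positive")

    # Number of frames on each side of the current frame.
    # 270 ms / 10 ms = 27 frames, i.e. 13 frames on each side plus the center.
    half = int(round(window_ms / frame_ms)) // 2

    n = len(rates)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)          # clip the window at the start
        hi = min(n, i + half + 1)      # clip the window at the end
        segment = rates[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


if __name__ == "__main__":
    # Toy input: the rate steps from 1.0 (same as reference) to 2.0 (twice as fast).
    raw = [1.0] * 30 + [2.0] * 30
    for value in smooth_relative_rate(raw)[25:35]:
        print(round(value, 3))
```

In this toy input the abrupt step in the raw rate is spread over roughly one window length, which is the kind of effect a smoothing window is chosen to control. A different window shape (for example a tapered one) would change the details of the transition.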