4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Synthesizing Dialogue Speech of Japanese Based on the Quantitative Analysis of Prosodic Features

Keikichi Hirose, Mayumi Sakata, Hiromichi Kawanami

Department of Information and Communication Engineering, Faculty of Engineering, University of Tokyo, Tokyo, Japan

Prosodic rules for the synthesis of spoken dialogue were derived from analyses of the fundamental frequency contours and speech rates of both dialogue speech and read speech. The fundamental frequency contours were first decomposed into phrase and accent components based on the superpositional model, and the command magnitudes/amplitudes were then analyzed by multiple regression analysis. For speech rate, the reduction rate of mora duration from reading style to dialogue style was calculated. After normalizing for sentence length, the mean reduction rate was calculated as an average over utterances without complicated syntactic structure. The results of these analyses were incorporated into the prosodic rules for dialogue speech synthesis. Using a previously developed formant speech synthesizer, speech was synthesized with both the earlier rules for read speech and the newly developed rules. A listening test showed that the new rules produce prosody better suited to dialogue speech.
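The superposition of phrase and accent components mentioned in the abstract can be illustrated with a minimal sketch. The abstract names only "the superpositional model"; the code below assumes the widely used Fujisaki formulation, in which the logarithm of the fundamental frequency is a baseline plus the responses to phrase commands (impulses) and accent commands (steps). The function names, the constants alpha = 3.0, beta = 20.0, gamma = 0.9, and all command values are illustrative assumptions, not values taken from the paper.

```python
import math

def phrase_response(t, alpha=3.0):
    """Response to a unit phrase command (impulse): alpha^2 * t * exp(-alpha * t) for t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Response to a unit accent command (step), limited to the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0_contour(times, fb, phrase_cmds, accent_cmds,
               alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 values (Hz) at the given times (s).

    fb          -- baseline frequency in Hz (assumed value, chosen by the caller)
    phrase_cmds -- list of (onset, magnitude) pairs
    accent_cmds -- list of (onset, offset, amplitude) triples
    """
    contour = []
    for t in times:
        # Components are added in the log-frequency domain.
        log_f0 = math.log(fb)
        for onset, magnitude in phrase_cmds:
            log_f0 += magnitude * phrase_response(t - onset, alpha)
        for onset, offset, amplitude in accent_cmds:
            log_f0 += amplitude * (
                accent_response(t - onset, beta, gamma)
                - accent_response(t - offset, beta, gamma)
            )
        contour.append(math.exp(log_f0))
    return contour

# Illustrative commands only: one phrase command at 0.2 s and one accent
# command from 0.3 s to 0.6 s, on a 100 Hz baseline.
example = f0_contour([0.0, 0.5, 1.0], 100.0, [(0.2, 0.5)], [(0.3, 0.6, 0.4)])
```

In this formulation, the command magnitudes and amplitudes are the quantities that an analysis such as the multiple regression described above would predict. A rule set for dialogue speech would then supply different command values than a rule set for read speech.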

