4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

A Method for Estimating Prosodic Symbol from Text for Japanese Text-To-Speech Synthesis

Ken-ichi Magata, Tomoki Hamagami, Mitsuo Komura

SECOM Intelligent Systems Laboratory, SECOM CO.,LTD., Tokyo, Japan

This report describes a method for estimating the separation degree at the bunsetsu boundary (SD) for Japanese text-to-speech synthesis. The method yields prosodic symbols without complicated linguistic analysis. First, we classify bunsetsus according to their final morpheme. Each class of bunsetsu is assigned a provisional separation degree in advance, which we call the "estimated separation degree" (ESD). The ESD is derived from the statistical tendency of the SD for each bunsetsu class. The SD is then decided by rules that correct the ESD, which serves as the initial value. The correction rules are constructed by comparing the ESD with the SD observed in natural speech, so as to cancel frequently occurring mismatches. An absolute evaluation test on a five-grade scale was performed on 300 sentences with prosodic symbols assigned by our method. The proportion of sentences rated "Natural" or "Somewhat unnatural but tolerable" exceeded 2/3, and the proportion rated "Serious error" was less than 10%, which we consider a satisfactory result.
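
The two-stage procedure described above (table lookup of the ESD by final morpheme, followed by rule-based correction) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the `Bunsetsu` type, the entries and numeric scale of `ESD_TABLE`, the default value, and the single correction rule `soften_repeated_strong_boundary` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Bunsetsu:
    """A bunsetsu as a list of morpheme surface forms (hypothetical type)."""
    morphemes: List[str]


# Hypothetical ESD table: final morpheme -> provisional separation degree.
# In the paper these values come from SD statistics; here they are invented.
ESD_TABLE: Dict[str, int] = {"は": 3, "が": 2, "を": 1, "の": 0}
DEFAULT_ESD = 1  # assumed fallback for final morphemes not in the table

# A correction rule sees all bunsetsus, the current degrees, and a boundary
# index, and returns the (possibly corrected) degree for that boundary.
Rule = Callable[[List[Bunsetsu], List[int], int], int]


def estimate_esd(b: Bunsetsu) -> int:
    """Classify a bunsetsu by its final morpheme and look up its ESD."""
    return ESD_TABLE.get(b.morphemes[-1], DEFAULT_ESD)


def soften_repeated_strong_boundary(
    bunsetsus: List[Bunsetsu], degrees: List[int], i: int
) -> int:
    """Hypothetical rule: weaken a strong boundary that directly follows
    another strong boundary."""
    if i > 0 and degrees[i - 1] >= 3 and degrees[i] >= 3:
        return degrees[i] - 1
    return degrees[i]


def estimate_sd(bunsetsus: List[Bunsetsu], rules: List[Rule]) -> List[int]:
    """Return one separation degree per boundary between adjacent bunsetsus."""
    # Stage 1: the ESD of each non-final bunsetsu is the initial degree.
    degrees = [estimate_esd(b) for b in bunsetsus[:-1]]
    # Stage 2: apply each correction rule in turn over all boundaries.
    for rule in rules:
        degrees = [rule(bunsetsus, degrees, i) for i in range(len(degrees))]
    return degrees


sentence = [
    Bunsetsu(["私", "は"]),
    Bunsetsu(["今日", "は"]),
    Bunsetsu(["本", "を"]),
    Bunsetsu(["読む"]),
]
result = estimate_sd(sentence, [soften_repeated_strong_boundary])
```

In this sketch each rule is evaluated against the degrees left by the previous pass, so the order of the rules matters; the abstract does not say whether the original system orders its rules this way.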


Bibliographic reference. Magata, Ken-ichi / Hamagami, Tomoki / Komura, Mitsuo (1996): "A method for estimating prosodic symbol from text for Japanese text-to-speech synthesis", in ICSLP-1996, 1373-1376.