5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

A Linguistic and Prosodic Database for Data-Driven Japanese TTS Synthesis

Atsuhiro Sakurai, Takashi Natsume, Keikichi Hirose

Dept. of Information and Communication Engineering, The University of Tokyo, Japan

We propose a method for building a database that pairs a parametric representation of F0 contours with linguistic and acoustic information, for use in data-driven Japanese text-to-speech (TTS) systems. The database contains recorded speech, F0 contours and their parametric labels, phonetic transcriptions with durations, and further linguistic information such as orthographic transcriptions, part-of-speech (POS) tags, and accent types. All information that cannot be obtained by dictionary lookup is derived automatically. This paper focuses on automatically obtaining the parametric labels, which describe F0 contours in terms of a superpositional model. Preliminary tests on a small data set show that the method can find the parametric representation of F0 contours with acceptable accuracy, and that accuracy can be improved by introducing additional linguistic information.
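The abstract does not say which superpositional model is used or how the parametric labels are stored. As a rough illustration only, the sketch below assumes a Fujisaki-style command-response model, in which log F0 is a baseline plus the summed responses to phrase commands and accent commands. The class names, field names, and default constants (`alpha`, `beta`, `gamma`) are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhraseCommand:
    onset: float      # command time in seconds
    magnitude: float  # impulse magnitude


@dataclass
class AccentCommand:
    onset: float      # start time in seconds
    offset: float     # end time in seconds
    amplitude: float  # step amplitude


@dataclass
class F0Label:
    """Hypothetical parametric label for one utterance (assumed layout)."""
    base_f0: float                                   # baseline F0 in Hz
    phrases: List[PhraseCommand] = field(default_factory=list)
    accents: List[AccentCommand] = field(default_factory=list)
    alpha: float = 3.0    # phrase-control time constant (assumed value)
    beta: float = 20.0    # accent-control time constant (assumed value)
    gamma: float = 0.9    # ceiling of the accent response (assumed value)


def phrase_response(t: float, alpha: float) -> float:
    # Impulse response of the phrase component; zero before the command.
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t: float, beta: float, gamma: float) -> float:
    # Step response of the accent component, clipped at gamma.
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_at(label: F0Label, t: float) -> float:
    """Superpose baseline, phrase, and accent components in the log domain."""
    log_f0 = math.log(label.base_f0)
    for p in label.phrases:
        log_f0 += p.magnitude * phrase_response(t - p.onset, label.alpha)
    for a in label.accents:
        rise = accent_response(t - a.onset, label.beta, label.gamma)
        fall = accent_response(t - a.offset, label.beta, label.gamma)
        log_f0 += a.amplitude * (rise - fall)
    return math.exp(log_f0)
```

Under these assumptions, a label such as `F0Label(base_f0=120.0, phrases=[PhraseCommand(0.0, 0.5)], accents=[AccentCommand(0.3, 0.6, 0.4)])` is a compact description of a contour, and `f0_at(label, t)` regenerates the F0 value at time `t`. Automatic labeling of the kind the abstract describes would amount to searching for the command values that best match an observed contour.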

Bibliographic reference. Sakurai, Atsuhiro / Natsume, Takashi / Hirose, Keikichi (1998): "A linguistic and prosodic database for data-driven Japanese TTS synthesis", in ICSLP-1998, paper 0735.