First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

A Rule-Based Speech Synthesizer Using Pitch Controlled Residual Wave Excitation Method

Kazuhiko Iwata (1), Yukio Mitome (1), Jun Kametani (1), Minoru Akamatsu (2), Seimitsu Tomotake (2), Kazunori Ozawa (1), Takao Watanabe (1)

(1) NEC Corporation, Kawasaki, Japan; (2) NEC Engineering, Kawasaki, Japan

A Japanese text-to-speech conversion system has been developed that generates highly intelligible and natural synthetic speech from arbitrary text written in Kanji characters (Chinese ideographs) by concatenating CV (C: consonant, V: vowel) and VC speech units. The system consists of a text analysis system and a speech synthesizer, both implemented on compact hardware for a personal computer. To generate high-quality synthetic speech, a pitch controlled residual wave excitation method is proposed, which uses residual waves as the excitation signals for a synthesis filter in all portions of each speech unit. To realize natural rhythm, a phoneme duration rule has been derived from statistical analysis of a large speech database. Evaluation experiments were carried out on the synthesizer. The 100-syllable articulation test yielded an accuracy of 88.8%, and the intelligibility test on 1,000 phonetically balanced words yielded an accuracy of 97.4%.
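The idea of exciting a synthesis filter with a residual wave can be illustrated with a minimal sketch. It assumes a generic linear-prediction (LPC) analysis and synthesis pair, which the abstract does not specify; the function names, filter order, and test signal are hypothetical, and the pitch control step of the proposed method is not shown. The residual is obtained by inverse filtering a speech unit, and feeding that residual back through the all-pole synthesis filter recovers the unit.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for the prediction-error filter A(z) = 1 + a1 z^-1 + ... + ap z^-p."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error energy
    return a


def inverse_filter(x, a):
    """Residual wave e[n] = sum_k a[k] * x[n-k] (FIR analysis filter A(z))."""
    p = len(a) - 1
    return [sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
            for n in range(len(x))]


def synthesis_filter(e, a):
    """All-pole synthesis filter 1/A(z): y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    p = len(a) - 1
    y = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, p + 1):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


# Hypothetical "speech unit": an impulse train (period 40 samples)
# passed through a stable two-pole resonator. Not data from the paper.
true_a = [1.0, -1.3, 0.8]
excitation = [1.0 if n % 40 == 0 else 0.0 for n in range(200)]
unit = synthesis_filter(excitation, true_a)

# Analysis: estimate the filter, then extract the residual wave.
a = levinson_durbin(autocorr(unit, 4), 4)
residual = inverse_filter(unit, a)

# Synthesis: the residual wave excites the synthesis filter.
rebuilt = synthesis_filter(residual, a)
max_err = max(abs(u - v) for u, v in zip(unit, rebuilt))
```

In this sketch the residual carries whatever the filter does not model, so exciting the filter with it reproduces the original unit up to rounding error. A pulse or noise excitation would discard that detail.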

