First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Improvement of the Synthetic Speech Quality of the Formant-type Speech Synthesizer and Its Subjective Evaluation

Norio Higuchi, Hisashi Kawai, Tohru Shimizu, Seiichi Yamamoto

KDD R&D Laboratories, Saitama, Japan

The authors have recently improved the synthetic speech quality of a Japanese speech synthesizer that was developed three years ago for a special-purpose word processor named "Pasokon Talk". The system is distinctive in using phonemes as synthesis units and in generating all acoustic parameters from production rules.

The previous and current systems differ mainly in: (1) the method for controlling the voice fundamental frequency contour, especially the phrase component of Fujisaki's generation model for that contour, (2) the method for controlling the formant frequencies and formant bandwidths, and (3) the characteristics of the voicing source.
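As an illustration of the generation model mentioned in item (1), the following is a minimal sketch of the commonly published form of Fujisaki's model, in which the logarithm of the fundamental frequency is the sum of a baseline, phrase components, and accent components. The abstract does not give the authors' control rules or parameter values, so the function names and the constants used here (alpha = 3.0/s, beta = 20.0/s, gamma = 0.9, and the command timings and amplitudes) are illustrative assumptions only and are not taken from the paper.

```python
import math


def phrase_response(t, alpha=3.0):
    """Phrase control response Gp(t) = alpha^2 * t * exp(-alpha * t) for t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control response Ga(t) = min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, base_hz, phrases, accents, alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at each time in `times` (seconds).

    phrases: list of (onset_time, amplitude) phrase commands (impulses).
    accents: list of (start_time, end_time, amplitude) accent commands (steps).
    All command values are hypothetical inputs, not values from the paper.
    """
    contour = []
    for t in times:
        # The model is additive in the log-frequency domain.
        log_f0 = math.log(base_hz)
        for onset, amp in phrases:
            log_f0 += amp * phrase_response(t - onset, alpha)
        for start, end, amp in accents:
            log_f0 += amp * (
                accent_response(t - start, beta, gamma)
                - accent_response(t - end, beta, gamma)
            )
        contour.append(math.exp(log_f0))
    return contour


# Example: one phrase command at 0 s and one accent command from 0.3 s to 0.8 s.
times = [i * 0.01 for i in range(200)]
contour = f0_contour(times, 120.0, [(0.0, 0.5)], [(0.3, 0.8, 0.4)])
```

In this formulation the phrase component alone sets the slowly declining baseline of the contour, which is why changes to its control, as in item (1), affect the whole utterance rather than individual accents.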

To verify the improvement in synthetic speech quality quantitatively, two subjective evaluations were performed: (1) intelligibility tests of Japanese syllables and (2) opinion tests of naturalness. The results of these comparative tests show that the synthetic speech of the current system is almost as intelligible as that of the previous system and is rated much more natural.

Bibliographic reference. Higuchi, Norio / Kawai, Hisashi / Shimizu, Tohru / Yamamoto, Seiichi (1990): "Improvement of the synthetic speech quality of the formant-type speech synthesizer and its subjective evaluation", in ICSLP-1990, 797-800.