4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

The Broad Study of Homograph Disambiguity for Mandarin Speech Synthesis

Wern-Jun Wang (1,2), Shaw-Hwa Hwang (2), Sin-Horng Chen (2)

(1) Telecommunication Laboratories, DGT, MOTC, R.O.C.
(2) National Chiao Tung University, R.O.C.

How to increase the intelligibility and naturalness of synthetic speech has drawn much attention in recent Mandarin text-to-speech (TTS) research. These two qualities have always been treated as the bottleneck because their effects are directly apparent to human perception. However, as the quality of synthetic speech improves for syllables, words, and phrases, there is also an increasing need to improve the various components of text processing. One such desired improvement for Mandarin speech synthesis is the accuracy of the character-to-sound (CTS) process. From the viewpoint of application, speech synthesis should aim to make synthetic speech understandable to human listeners and to minimize misunderstanding. It is therefore very important to increase the accuracy of the CTS process. This process is designed to predict phonetic pronunciations from a coarse surface text input, and the difficulty mainly results from ambiguous homograph characters. In this paper, we propose some effective analysis methods that incorporate linguistic knowledge to resolve homograph ambiguity. The methods used in the experiments are discriminating lexical association and a tree-based language model. The experimental results show an improvement of about 10% in average accuracy over the traditional maximum frequency guess approach for most ambiguous homograph characters.

Bibliographic reference. Wang, Wern-Jun / Hwang, Shaw-Hwa / Chen, Sin-Horng (1996): "The broad study of homograph disambiguity for Mandarin speech synthesis", In ICSLP-1996, 1389-1392.