This paper reports the results of an acoustic study of Mandarin Chinese carried out for the AT&T Mandarin text-to-speech system. We present the optimal classification of vowels for the purposes of the synthesizer, and discuss several coarticulation effects and their implications for collecting acoustic inventory elements. With this approach we achieve excellent speech quality using a diphone-based concatenative system.
Bibliographic reference. Shih, Chilin (1995): "Study of vowel variations for a Mandarin speech synthesizer", in EUROSPEECH-1995, 1807-1810.