Many applications of text-to-speech synthesis involve the translation of names and/or addresses into speech. However, most commercially available synthesizers are particularly poor at synthesizing names, since their development typically focuses on synthesizing ordinary words. Names in the United States, for example, are often pronounced according to rules that differ from those of English words. Pronouncing names properly is particularly difficult because there are so many of them (America has over 1.5 million different family names) and they derive from dozens of languages.
Bellcore is developing an LPC demisyllable-based synthesizer called "SPOKESMAN," which has been tailored for the synthesis of names and addresses. This paper describes the need for good name pronunciation capabilities and high segmental intelligibility in key applications and provides an overview of SPOKESMAN's programs. Evaluation tests find that SPOKESMAN has higher segmental intelligibility than commercially available synthesizers and scores higher in preference tests.
Bibliographic reference. Spiegel, Murray F. / Macchi, Marian J. / Gollhardt, Kurt D. (1989): "Synthesis of names by a demisyllable-based speech synthesizer (SPOKESMAN)", In EUROSPEECH-1989, 1117-1120.