The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis
Modern speech technologies need to be flexible and adaptable to any framework. The globalization of mass media introduces the challenge of multilingualism into the most popular speech applications, such as text-to-speech synthesis and automatic speech recognition. Mixed-language texts vary in nature, and some of their essential characteristics must be considered when they are processed. In Spain, as in other countries, the use of English and other words of foreign origin is growing. A particularity of Peninsular Spanish is the tendency to nativize the pronunciation of foreign words so that they fit Spanish phonetics. In this work, our goal was to approach the nativization challenge with data-driven methods, since they are transferable to other languages and do not compromise performance. Training and test corpora for nativization were manually crafted, and the experiments were carried out using pronunciation by analogy. The results were encouraging and showed that even a small training corpus of 1000 words yields a higher level of intelligibility for English inclusions in Spanish utterances.
Index Terms: nativization, grapheme-to-phoneme conversion, phoneme-to-phoneme conversion, Spanish TTS, pronunciation by analogy
Bibliographic reference. Polyákova, Tatyana / Bonafonte, Antonio (2010): "Nativization of English words in Spanish using analogy", In SSW7-2010, 294-299.