Speech Prosody 2006

Dresden, Germany
May 2-5, 2006

Prosody Generation in the Speech-to-Speech Translation Framework

Pablo Daniel Agüero, Jordi Adell, Antonio Bonafonte

TALP Research Center, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain

This paper deals with speech synthesis in the framework of speech-to-speech translation. Our current focus is on translating speeches or conversations between humans so that a third person can listen to them in their own language. In this framework the style is spoken rather than written, and the original speech carries a great deal of non-linguistic information (such as speaker emotion). In this work we propose using prosodic features of the original speech to generate prosody in the target language. Relevant features are found using an unsupervised clustering algorithm that finds, in a bilingual speech corpus, intonation clusters in the source speech that are relevant to the target speech. Preliminary results already show a significant improvement in the quality of the synthetic speech.

Bibliographic reference. Agüero, Pablo Daniel / Adell, Jordi / Bonafonte, Antonio (2006): "Prosody generation in the speech-to-speech translation framework", In SP-2006, paper 149.