It is now commonplace to use web conferencing technology to hold meetings between participants in different physical locations. A drawback of this technology is that nearly all interaction between participants is monolingual. Here, we demonstrate a novel extension of this technology that enables cross-lingual speech-to-speech communication between conference participants in real time. We model the translation problem as a combination of incremental speech recognition and segmentation, addressing the question of which segmentation strategy maximizes translation accuracy while minimizing latency. Our demonstration takes the form of a web conferencing scenario in which a presenter speaks in one language while participants listen to, or read, translations of the talk in real time. The system is flexible enough to allow real-time translation of technical talks as well as speeches covering broad topics.
Bibliographic reference: Chen, John / Wen, Shufei / Rangarajan Sridhar, Vivek Kumar / Bangalore, Srinivas (2013): "Multilingual web conferencing using speech-to-speech translation", in INTERSPEECH-2013, 1861-1863.