International Workshop on Spoken Language Translation (IWSLT) 2007

Trento, Italy
October 15-16, 2007

The TÜBİTAK-UEKAE Statistical Machine Translation System for IWSLT 2007

Coşkun Mermer, Hamza Kaya, Mehmet Uğur Doğan

National Research Institute of Electronics and Cryptology (UEKAE), The Scientific and Technological Research Council of Turkey (TÜBİTAK), Gebze, Kocaeli, Turkey

We describe the TÜBİTAK-UEKAE system that participated in the Arabic-to-English and Japanese-to-English translation tasks of the IWSLT 2007 evaluation campaign. Our system is built on Moses, the open-source phrase-based statistical machine translation software. Of the available corpora and linguistic resources, the system uses only the supplied training data and an Arabic morphological analyzer. We present a run-time lexical approximation method for coping with out-of-vocabulary words during decoding. We tested our system under both automatic speech recognition (ASR) and clean transcript (clean) input conditions. Our system was ranked first in both the Arabic-to-English and Japanese-to-English tasks under the “clean” condition.


Bibliographic reference. Mermer, Coşkun / Kaya, Hamza / Doğan, Mehmet Uğur (2007): "The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2007", In IWSLT-2007, 176-179.