International Workshop on Spoken Language Translation (IWSLT) 2008

Honolulu, Hawaii, USA
October 20-21, 2008

The TÜBİTAK-UEKAE Statistical Machine Translation System for IWSLT 2008

Coşkun Mermer, Hamza Kaya, Ömer Farukhan Güneş, Mehmet Uğur Doğan

National Research Institute of Electronics and Cryptology (UEKAE), The Scientific and Technological Research Council of Turkey (TÜBİTAK), Gebze, Kocaeli, Turkey

We present the TÜBİTAK-UEKAE statistical machine translation system that participated in the IWSLT 2008 evaluation campaign. Our system is based on Moses, the open-source phrase-based statistical machine translation software. In addition, phrase-table augmentation is applied to maximize source-language coverage; lexical approximation is applied to replace out-of-vocabulary words with known words prior to decoding; and automatic punctuation insertion is improved. We describe the preprocessing and postprocessing steps and our training and decoding procedures. We report results for the classical Arabic-English and Chinese-English tasks as well as the new Chinese-Spanish direct and Chinese-English-Spanish pivot translation tasks.

