International Workshop on Spoken Language Translation (IWSLT) 2010

Paris, France
December 2-3, 2010

The RWTH Aachen Machine Translation System for IWSLT 2010

Saab Mansour, Stephan Peitz, David Vilar, Joern Wuebker, Hermann Ney

Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany

In this paper we describe the statistical machine translation system of RWTH Aachen University developed for the translation task of IWSLT 2010. This year, we participated in the BTEC translation task for the Arabic-to-English language direction. We experimented with two state-of-the-art decoders: a phrase-based decoder and a hierarchical decoder. Extensions to the decoders included phrase training (as opposed to heuristic phrase extraction) for the phrase-based decoder and soft syntactic features for the hierarchical decoder. Additionally, we experimented with various rule-based and statistical segmenters for Arabic.
Because the systems differ in both decoder and segmentation methodology, we expect complementary variation in the results achieved by each system. We exploit this variation by combining the systems: we try different strategies for system combination and report significant improvements over the best single system.

Bibliographic reference. Mansour, Saab / Peitz, Stephan / Vilar, David / Wuebker, Joern / Ney, Hermann (2010): "The RWTH Aachen machine translation system for IWSLT 2010", in IWSLT-2010, 163-168.