International Workshop on Spoken Language Translation (IWSLT) 2008
Honolulu, Hawaii, USA
We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year's evaluation, we ported the Syntax Augmented MT (SAMT) toolkit to the Hadoop MapReduce parallel processing architecture, allowing us to efficiently run experiments evaluating a novel "wider pipelines" approach that integrates evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality from using N-best alignments in both hierarchical and syntax-augmented translation systems.
Bibliographic reference. Zollmann, Andreas / Venugopal, Ashish / Vogel, Stephan (2008): "The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments", In IWSLT-2008, 18-25.