International Workshop on Spoken Language Translation (IWSLT) 2007

Trento, Italy
October 15-16, 2007

MaTrEx: the DCU Machine Translation System for IWSLT 2007

Hany Hassan, Yanjun Ma, Andy Way

School of Computing, Dublin City University, Dublin, Ireland

In this paper, we describe the machine translation system developed at DCU that was used for our second participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2007). In this participation, we focus on some new methods to improve system quality. Specifically, we try our word packing technique for different language pairs, we smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to address the high number of out-of-vocabulary items, and finally we deploy a translation-based model for case and punctuation restoration.
We participated in both the classical and challenge tasks for the following translation directions: Chinese–English, Japanese–English and Arabic–English. For the last two tasks, we translated both the single-best ASR hypotheses and the correct recognition results; for Chinese–English, we just translated the correct recognition results. We report the results of the system for the provided evaluation sets, together with some additional experiments carried out following identification of some simple tokenisation errors in the official runs.

Bibliographic reference. Hassan, Hany / Ma, Yanjun / Way, Andy (2007): "MaTrEx: the DCU machine translation system for IWSLT 2007", In IWSLT-2007, 69-75.