International Workshop on Spoken Language Translation (IWSLT) 2007
This paper describes the University of Maryland statistical machine translation system used in the IWSLT 2007 evaluation. Our focus was threefold: using hierarchical phrase-based models in spoken language translation, incorporating sub-lexical information into model estimation via morphological analysis (Arabic) and word and character segmentation (Chinese), and using n-gram sequence models for source-side punctuation prediction. Our efforts yield significant improvements in Chinese-English and Arabic-English translation tasks for both spoken language and human transcription conditions.
Bibliographic reference. Dyer, Christopher J. (2007): "The University of Maryland translation system for IWSLT 2007", In IWSLT-2007, 180-185.