International Workshop on Spoken Language Translation (IWSLT) 2010

Paris, France
December 2-3, 2010

MorphTagger: HMM-Based Arabic Segmentation for Statistical Machine Translation

Saab Mansour

Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany

In this paper, we investigate different methodologies of Arabic segmentation for statistical machine translation by comparing a rule-based segmenter with several statistical segmenters. We also present a new segmentation method that meets the needs of a real-time translation system without impairing translation accuracy.


Bibliographic reference. Mansour, Saab (2010): "MorphTagger: HMM-based Arabic segmentation for statistical machine translation", in IWSLT-2010, 321-327.