International Workshop on Spoken Language Translation (IWSLT) 2010
This paper describes the ICT statistical machine translation system used in the evaluation campaign of the International Workshop on Spoken Language Translation 2010. We participated in the DIALOG tasks for both Chinese-to-English and English-to-Chinese translation. For both tasks, our system achieved significant improvements through several effective methods: 1) refining the data preprocessing, including Chinese word segmentation and named entity recognition; 2) reducing the number of out-of-vocabulary (OOV) words on the final test set by applying a fuzzy matching strategy; 3) for the ASR task, treating the generation of a better decoder input from the N-best lists of ASR output as a special kind of translation task; 4) improving the performance of every single decoder and reranking the N-best lists to produce the final submitted results.
Bibliographic reference. Xiong, Hao / Xie, Jun / Yu, Hui / Liu, Kai / Luo, Wei / Mi, Haitao / Liu, Yang / Lü, Yajuan / Liu, Qun (2010): "The ICT statistical machine translation system for IWSLT 2010", In IWSLT-2010, 73-79.