International Workshop on Spoken Language Translation (IWSLT) 2006

Keihanna Science City, Kyoto, Japan
November 27-28, 2006

Recent Results on MT Evaluation in the GALE Program

Salim Roukos

IBM, Yorktown Heights, NY, USA

We will give an overview of the first year's evaluation results of the GALE program, in which evaluation is based on human post-editing of MT system output. A post-editor edits the MT system output until it conveys the same meaning as the "Gold" reference. The MT error metric, the Human Translation Error Rate (HTER), is the number of edits performed by the post-editor, normalized by the length of the "Gold" reference. We will report on the sensitivity and stability of the new HTER metric for evaluating MT systems. We will also compare how well various automated metrics (BLEU, TER) correlate with HTER.
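
As an illustration of the metric described above, the following is a minimal sketch, not the GALE scoring tool. It assumes whitespace tokenization and counts only word-level insertions, deletions, and substitutions; any additional edit types the official scoring may use (for example phrase shifts) are not modeled. The function names `word_edit_distance` and `hter` are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists.

    Assumption: only insertions, deletions, and substitutions are counted.
    """
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def hter(mt_output, post_edited, gold_reference):
    """Edits from MT output to its post-edited form, divided by the
    length (in words) of the "Gold" reference, as the abstract describes."""
    edits = word_edit_distance(mt_output.split(), post_edited.split())
    gold_len = len(gold_reference.split())
    if gold_len == 0:
        raise ValueError("gold reference must not be empty")
    return edits / gold_len


if __name__ == "__main__":
    # Made-up sentences, for illustration only.
    score = hter("the cat sat on mat",
                 "the cat sat on the mat",
                 "the cat is sitting on the mat")
    print(f"HTER = {score:.3f}")
```

In this toy example the post-editor inserts a single word, so the score is one edit divided by the seven words of the made-up Gold reference.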


