International Workshop on Spoken Language Translation (IWSLT) 2010

Paris, France
December 2-3, 2010

Modelling Pronominal Anaphora in Statistical Machine Translation

Christian Hardmeier, Marcello Federico

Fondazione Bruno Kessler, Human Language Technologies, Trento, Italy

Current Statistical Machine Translation (SMT) systems translate texts sentence by sentence without considering any cross-sentential context. Assuming independence between sentences makes it difficult to take certain translation decisions when the necessary information cannot be determined locally. We argue for the necessity of including cross-sentence dependencies in SMT. As a case in point, we study the problem of pronominal anaphora translation by manually evaluating German-English SMT output. We then present a word dependency model for SMT, which can represent links between word pairs in the same or in different sentences. We use this model to integrate the output of a coreference resolution system into English-German SMT with a view to improving the translation of anaphoric pronouns.


Bibliographic reference. Hardmeier, Christian / Federico, Marcello (2010): "Modelling pronominal anaphora in statistical machine translation", in IWSLT-2010, 283-289.