International Workshop on Spoken Language Translation (IWSLT) 2007

Trento, Italy
October 15-16, 2007

The TALP N-Gram-based SMT System for IWSLT 2007

Patrik Lambert (1), Marta R. Costa-jussà (1), Josep M. Crego (1), Maxim Khalilov (1), José B. Mariño (1), Rafael E. Banchs (1), José A. R. Fonollosa (1), Holger Schwenk (2)

(1) TALP Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain
(2) LIMSI-CNRS, Orsay, France

This paper describes TALPtuples, the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Politècnica de Catalunya) in Barcelona. It focuses on improvements and extensions to the previous years' system, chiefly the optimization of alignment parameters as a function of translation metric scores and rescoring with a neural network language model.
   Results are reported for two translation directions, Arabic to English and Chinese to English, and all language-related preprocessing and translation schemes are explained in detail.

Bibliographic reference. Lambert, Patrik / Costa-jussà, Marta R. / Crego, Josep M. / Khalilov, Maxim / Mariño, José B. / Banchs, Rafael E. / Fonollosa, José A. R. / Schwenk, Holger (2007): "The TALP N-gram-based SMT system for IWSLT 2007", in IWSLT-2007, 169-175.