International Workshop on Spoken Language Translation (IWSLT) 2011
San Francisco, CA, USA
This paper describes NICT's participation in the IWSLT 2011 evaluation campaign for the TED speech translation Chinese-English shared task. Our approach was based on a phrase-based statistical machine translation system that was augmented in two ways.
First, we introduced rule-based reordering constraints on the decoding. These took the form of a set of rules used to split each input utterance into segments that could be decoded almost independently. The idea is that constraining the decoding process in this manner greatly reduces the decoder's search space and cuts out many possibilities for error, while still allowing a correct output to be generated. The rules exploit punctuation and spacing in the input utterances, and we use these positions to delimit our segments. Not all punctuation/spacing positions were used as segment boundaries; the set of positions used was determined by linguistically based heuristics.
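The segmentation idea can be illustrated with a minimal sketch. The abstract does not list the actual rules, so everything here is an assumption made for illustration: the boundary token set `BOUNDARY_TOKENS`, the function name `segment_utterance`, and the `min_len` heuristic (a stand-in for the linguistically based heuristics, accepting a boundary only when the current segment is long enough).

```python
from typing import List

# Hypothetical boundary set: the paper's real rules are not given in the abstract.
BOUNDARY_TOKENS = {",", ".", ";", "?", "!", "，", "。", "；", "？", "！"}


def segment_utterance(tokens: List[str], min_len: int = 3) -> List[List[str]]:
    """Split a tokenized utterance at selected punctuation positions.

    A punctuation token closes the current segment only if that segment
    already holds at least `min_len` tokens (an assumed heuristic), so
    not every punctuation position becomes a segment boundary.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        current.append(tok)
        if tok in BOUNDARY_TOKENS and len(current) >= min_len:
            segments.append(current)
            current = []
    if current:
        # Keep any trailing tokens that were not closed by a boundary.
        segments.append(current)
    return segments


tokens = ["a", ",", "b", "c", "d", ",", "e", "f", "."]
segments = segment_utterance(tokens)
```

In this example the first comma is rejected as a boundary because the segment would be too short, while the second comma and the final period are accepted. Each resulting segment could then be passed to the decoder with reordering restricted to stay within it.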
Second, we used two heterogeneous methods to build the translation model and the lexical reordering model for our systems. The first was the widely used approach of aligning with GIZA++ and then applying phrase-extraction heuristics. The second used a recently developed Bayesian alignment technique that performs both phrase-to-phrase alignment and phrase pair extraction within a single unsupervised process. The models produced by this type of alignment technique are typically very compact while maintaining a high level of translation quality. We evaluated both methods of translation model construction in isolation, and our results show that their performance is comparable. We also integrated both models by linear interpolation to obtain a model that outperforms either component. Finally, we added an indicator feature to the log-linear model to mark those phrases that were in the intersection of the two translation models. The addition of this feature also provided a small improvement in performance.
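The model combination can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: each translation model is reduced to a dictionary mapping a (source, target) phrase pair to a single probability, the interpolation weight of 0.5 is arbitrary (the abstract gives no value), and the names `interpolate_models`, `giza_model`, and `bayes_model` are hypothetical.

```python
from typing import Dict, Tuple

PhrasePair = Tuple[str, str]


def interpolate_models(
    model_a: Dict[PhrasePair, float],
    model_b: Dict[PhrasePair, float],
    weight: float = 0.5,  # assumed value; not taken from the paper
) -> Dict[PhrasePair, Tuple[float, float]]:
    """Linearly interpolate two phrase tables and flag shared phrase pairs.

    Returns, for each phrase pair in either table, a tuple of
    (interpolated probability, indicator), where the indicator is 1.0
    if the pair occurs in both tables and 0.0 otherwise.
    """
    combined: Dict[PhrasePair, Tuple[float, float]] = {}
    for pair in set(model_a) | set(model_b):
        # A pair missing from one table contributes probability 0 from it.
        p_a = model_a.get(pair, 0.0)
        p_b = model_b.get(pair, 0.0)
        prob = weight * p_a + (1.0 - weight) * p_b
        in_both = 1.0 if pair in model_a and pair in model_b else 0.0
        combined[pair] = (prob, in_both)
    return combined


# Toy phrase tables with made-up probabilities, for illustration only.
giza_model = {("你好", "hello"): 0.8, ("谢谢", "thanks"): 0.6}
bayes_model = {("你好", "hello"): 0.6, ("再见", "goodbye"): 0.9}
combined = interpolate_models(giza_model, bayes_model)
```

The indicator value would be exposed to the decoder as one additional feature in the log-linear model, with its weight tuned alongside the other feature weights.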
Bibliographic reference. Finch, Andrew / Goh, Chooi-Ling / Neubig, Graham / Sumita, Eiichiro (2011): "The NICT translation system for IWSLT 2011", In IWSLT-2011, 49-56.