Improving Phrase-Based Statistical Translation Through Combination of Word Alignments

This paper investigates the combination of word alignments computed with the competitive linking algorithm and with the well-established IBM models. New training methods for phrase-based statistical translation are proposed and evaluated on a popular travel-domain task, with English as the target language and Chinese, Japanese, Arabic, and Italian as source languages. Experiments were performed with a highly competitive phrase-based translation system, which ranked at the top in the 2005 IWSLT evaluation campaign. Applying the proposed techniques yielded consistent improvements in BLEU and NIST scores on all considered language pairs, even under very different data-sparseness conditions.
