Word-Based Alignment, Phrase-Based Translation: What’s the Link?

State-of-the-art statistical machine translation is based on alignments between phrases ‐ sequences of words in the source and target sentences. The learning step in these systems often relies on alignments between words. It is often assumed that the quality of this word alignment is critical for translation. However, recent results suggest that the relationship between alignment quality and translation quality is weaker than previously thought. We investigate this question directly, comparing the impact of highquality alignments with a carefully constructed set of degraded alignments. In order to tease apart various interactions, we report experiments investigating the impact of alignments on different aspects of the system. Our results confirm a weak correlation, but they also illustrate that more data and better feature engineering may be more beneficial than better alignment.

[1]  Chris Callison-Burch,et al.  Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora , 2004, ACL.

[2]  Salim Roukos,et al.  A Maximum Entropy Word Aligner for Arabic-English Machine Translation , 2005, HLT.

[3]  Robert C. Moore A Discriminative Framework for Bilingual Word Alignment , 2005, HLT.

[4]  David Chiang,et al.  A Hierarchical Phrase-Based Model for Statistical Machine Translation , 2005, ACL.

[5]  Necip Fazil Ayan,et al.  Going Beyond AER: An Extensive Analysis of Word Alignments and Their Impact on MT , 2006, ACL.

[6]  Christof Monz,et al.  NeurAlign: Combining Word Alignments Using Neural Networks , 2005, HLT/EMNLP.

[7]  I. Dan Melamed Automatic Construction of Clean Broad-Coverage Translation Lexicons , 1996, AMTA.

[8]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[9]  Philipp Koehn,et al.  Constraining the Phrase-Based, Joint Probability Statistical Translation Model , 2006, WMT@HLT-NAACL.

[10]  Nitin Madnani,et al.  The Hiero Machine Translation System: Extensions, Evaluation, and Analysis , 2005, HLT.

[11]  Ben Taskar,et al.  Max-Margin Parsing , 2004, EMNLP.

[12]  Hermann Ney,et al.  A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.

[13]  Ben Taskar,et al.  A Discriminative Matching Approach to Word Alignment , 2005, HLT.

[14]  Adam L. Berger,et al.  A Maximum Entropy Approach to Natural Language Processing , 1996, CL.

[15]  David Yarowsky,et al.  Inducing Multilingual Text Analysis Tools via Robust Projection across Aligned Corpora , 2001, HLT.

[16]  Stephan Vogel,et al.  Considerations in maximum mutual information and minimum classification error training for statistical machine translation , 2005, EAMT.

[17]  Philip Resnik,et al.  Word-level Alignment for Multilingual Resource Acquisition , 2002 .

[18]  Philipp Koehn,et al.  Pharaoh: A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models , 2004, AMTA.

[19]  Philip Resnik,et al.  Bootstrapping parsers via syntactic projection across parallel texts , 2005, Natural Language Engineering.

[20]  Hermann Ney,et al.  A Comparison of Alignment Models for Statistical Machine Translation , 2000, COLING.

[21]  Noah A. Smith,et al.  Bilingual Parsing with Factored Estimation: Using English to Parse Korean , 2004, EMNLP.

[22]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[23]  Marc Dymetman,et al.  Translating with Non-contiguous Phrases , 2005, HLT.

[24]  Alexander M. Fraser,et al.  Squibs and Discussions: Measuring Word Alignment Quality for Statistical Machine Translation , 2007, CL.

[25]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[26]  Alexander H. Waibel,et al.  Learning a Log-Linear Model with Bilingual Phrase-Pair Features for Statistical Machine Translation , 2005, International Joint Conference on Natural Language Processing.

[27]  John DeNero,et al.  Why Generative Phrase Models Underperform Surface Heuristics , 2006, WMT@HLT-NAACL.

[28]  Daniel Marcu,et al.  A Phrase-Based,Joint Probability Model for Statistical Machine Translation , 2002, EMNLP.

[29]  Adwait Ratnaparkhi,et al.  A Maximum Entropy Model for Part-Of-Speech Tagging , 1996, EMNLP.

[30]  Hermann Ney,et al.  Improved Alignment Models for Statistical Machine Translation , 1999, EMNLP.

[31]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[32]  Douglas W. Oard,et al.  Improved Cross-Language Retrieval using Backoff Translation , 2001, HLT.

[33]  Alexander M. Fraser,et al.  A Smorgasbord of Features for Statistical Machine Translation , 2004, NAACL.