Inversion Transduction Grammar for Joint Phrasal Translation Modeling

We present a phrasal inversion transduction grammar as an alternative to joint phrasal translation models. This syntactic model is similar to its flat-string phrasal predecessors, but admits polynomial-time algorithms for Viterbi alignment and EM training. We demonstrate that the consistency constraints that allow flat phrasal models to scale also help ITG algorithms, producing an 80-times faster inside-outside algorithm. We also show that the phrasal translation tables produced by the ITG are superior to those of the flat joint phrasal model, producing up to a 2.5 point improvement in BLEU score. Finally, we explore, for the first time, the utility of a joint phrasal translation model as a word alignment method.

[1]  Wolfgang Macherey,et al.  Lattice-based Minimum Error Rate Training for Statistical Machine Translation , 2008, EMNLP.

[2]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[3]  Alex Waibel,et al.  The CMU statistical machine translation system , 2003, MTSUMMIT.

[4]  Enrique Vidal,et al.  A Recursive Statistical Translation Model , 2005, ParallelText@ACL.

[5]  Hermann Ney,et al.  A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.

[6]  Alexander M. Fraser,et al.  Semi-Supervised Training for Statistical Word Alignment , 2006, ACL.

[7]  I. Dan Melamed,et al.  Multitext Grammars and Synchronous Parsers , 2003, NAACL.

[8]  I. Dan Melamed,et al.  Manual Annotation of Translational Equivalence: The Blinker Project , 1998, ArXiv.

[9]  Daniel Gildea,et al.  Synchronous Binarization for Machine Translation , 2006, NAACL.

[10]  Daniel Gildea,et al.  Syntax-Based Alignment: Supervised or Unsupervised? , 2004, COLING.

[11]  Daniel Gildea,et al.  Stochastic Lexicalized Inversion Transduction Grammar for Alignment , 2005, ACL.

[12]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[13]  Daniel Marcu,et al.  A Phrase-Based,Joint Probability Model for Statistical Machine Translation , 2002, EMNLP.

[14]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[15]  Philipp Koehn,et al.  Constraining the Phrase-Based, Joint Probability Statistical Translation Model , 2006, WMT@HLT-NAACL.

[16]  Dekai Wu,et al.  Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora , 1997, CL.

[17]  Philipp Koehn,et al.  Manual and Automatic Evaluation of Machine Translation between European Languages , 2006, WMT@HLT-NAACL.

[18]  Stephan Vogel,et al.  Considerations in maximum mutual information and minimum classification error training for statistical machine translation , 2005, EAMT.

[19]  Colin Cherry,et al.  A Comparison of Syntactically Motivated Word Alignment Spaces , 2006, EACL.

[20]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[21]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.