Soft syntactic constraints for Arabic–English hierarchical phrase-based translation

Adding syntax to statistical machine translation involves a tradeoff between taking advantage of linguistic analysis and allowing the model to exploit parallel training data that has no linguistic analysis: translation quality versus coverage. A number of previous efforts have tackled this tradeoff by starting with a commitment to linguistically motivated analyses and then finding appropriate ways to soften that commitment. We present an approach that explores the tradeoff from the other direction, starting with a translation model learned directly from aligned parallel text and then adding soft constituent-level constraints based on parses of the source language. We argue that for these constraints to improve translation, they must be fine-grained: the constraints should vary by constituent type and by the type of match or mismatch with the parse. We also use a different feature weight optimization technique, one capable of handling a large number of features, which eliminates the bottleneck of feature selection. We obtain substantial improvements in performance for translation from Arabic to English.
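As a rough illustration of what fine-grained soft constraints might look like, the sketch below fires one feature per constituent type and per kind of match between a translation rule's source span and the source parse. It is a minimal sketch under stated assumptions, not the paper's implementation: the feature names (`NP=` for an exact match, `NP+` for a crossing mismatch), the half-open span representation, the function names, and the example weights are all hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]              # half-open [start, end) over source tokens
Constituent = Tuple[str, int, int]  # (label, start, end) from a source-side parse


def crosses(span: Span, const: Span) -> bool:
    """True if the spans overlap but neither contains the other."""
    (i, j), (k, l) = span, const
    return (i < k < j < l) or (k < i < l < j)


def soft_constraint_features(span: Span,
                             constituents: List[Constituent]) -> Dict[str, int]:
    """Count features that depend on constituent type and match type.

    Hypothetical naming: "<label>=" when the span is exactly a constituent,
    "<label>+" when the span crosses that constituent's boundary.
    """
    feats: Dict[str, int] = {}
    for label, k, l in constituents:
        if (k, l) == span:
            name = f"{label}="
        elif crosses(span, (k, l)):
            name = f"{label}+"
        else:
            continue  # span is nested in, contains, or is disjoint from the constituent
        feats[name] = feats.get(name, 0) + 1
    return feats


def soft_score(feats: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum added to the model score; a soft preference, not a filter."""
    return sum(weights.get(name, 0.0) * count for name, count in feats.items())


# Toy parse of a 5-token source sentence (labels and spans are made up).
parse = [("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)]
weights = {"NP=": 0.5, "NP+": -0.2, "VP+": -0.3}  # illustrative values only

matching = soft_constraint_features((0, 2), parse)  # span coincides with the NP
crossing = soft_constraint_features((1, 3), parse)  # span straddles NP and VP
```

Because each constituent type and each match type has its own weight, the optimizer can reward respecting some constituents while tolerating or even favoring violations of others, instead of applying a single uniform syntactic penalty.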
