String-to-Dependency Statistical Machine Translation

We propose a novel string-to-dependency algorithm for statistical machine translation. This algorithm employs a target dependency language model during decoding to exploit long-distance word relations, which a traditional n-gram language model cannot capture. Experiments show that the algorithm achieves a significant improvement in MT performance over a state-of-the-art hierarchical string-to-string system on the NIST MT06 and MT08 newswire evaluation sets.
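To make the contrast between the two kinds of language model concrete, the following toy sketch lists the context a trigram model conditions on and the head-dependent pairs exposed by a dependency tree. It is an illustration only, not the paper's model: the sentence, the hand-assigned head indices, and the helper names `ngram_contexts` and `dependency_events` are all assumptions introduced here.

```python
from typing import List, Tuple


def ngram_contexts(words: List[str], n: int = 3) -> List[Tuple[Tuple[str, ...], str]]:
    """Pair each word with the (up to) n-1 preceding words an n-gram LM conditions on."""
    return [(tuple(words[max(0, i - n + 1):i]), w) for i, w in enumerate(words)]


def dependency_events(words: List[str], heads: List[int]) -> List[Tuple[str, str, int]]:
    """Return (head word, dependent word, surface distance) for every non-root word.

    heads[i] is the index of the head of words[i]; -1 marks the root.
    """
    events = []
    for i, h in enumerate(heads):
        if h >= 0:
            events.append((words[h], words[i], abs(h - i)))
    return events


# Hypothetical example sentence with hand-assigned (assumed) dependency heads.
words = "the boy who lives next door will find it".split()
heads = [1, 7, 3, 1, 5, 3, 7, -1, 7]

contexts = ngram_contexts(words, n=3)
events = dependency_events(words, heads)

# The subject "boy" attaches to the verb "find" across six positions,
# while the trigram context for "find" only covers "door will".
longest = max(events, key=lambda e: e[2])
print("trigram context for 'find':", contexts[7][0])
print("longest head-dependent pair:", longest)
```

In this toy sentence the verb and its subject are linked directly in the dependency tree even though they lie outside any trigram window, which is the kind of relation a model that scores dependents given their heads can use during decoding.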
