Dependency Graph-to-String Statistical Machine Translation

We present graph-based translation models which translate source graphs into target strings. Source graphs are constructed from dependency trees with extra links so that non-syntactic phrases are connected. Inspired by phrase-based models, we first introduce a translation model which segments a graph into a sequence of disjoint subgraphs and generates a translation by combining subgraph translations left-to-right using beam search. Like phrase-based models, however, this model is weak at phrase reordering. We therefore introduce a second model, based on a synchronous node replacement grammar, which learns recursive translation rules. We provide two implementations of this model with different restrictions so that source graphs can be parsed efficiently. Experiments on Chinese–English and German–English show that our graph-based models are significantly better than the corresponding sequence- and tree-based baselines.
