Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation

We present a new translation model integrating the shallow local multi bottom-up tree transducer. We perform a large-scale empirical evaluation of the resulting system, which demonstrates that it significantly outperforms a realistic tree-to-tree baseline on the WMT 2009 English→German translation task. As an additional contribution, we make the developed software and complete tool-chain publicly available for further experimentation.
