Compositions of Tree-to-Tree Statistical Machine Translation Models

Compositions of well-known tree-to-tree translation models used in statistical machine translation are investigated. Synchronous context-free grammars are closed under composition in both the unweighted and the weighted case. In addition, it is demonstrated that compositions of synchronous tree-substitution grammars are closely connected to compositions of certain tree transducers, because the intermediate trees can encode finite-state information. Using this connection, the composition closure of synchronous tree-substitution grammars is identified in the unweighted and the weighted case. In the weighted case, these results build on a novel lifting strategy that should also prove useful in other settings.
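To make the notion of weighted composition concrete, the following minimal sketch composes two finite weighted tree relations: the weight of a pair (s, u) is the sum, over all intermediate trees t, of the product of the weights of (s, t) and (t, u). This is only the standard definition applied to explicitly listed pairs, not the grammar-level construction the paper studies. The function name `compose`, the encoding of trees as nested tuples, the example trees, and the use of ordinary real-number addition and multiplication as the semiring are all assumptions made for illustration.

```python
from collections import defaultdict


def compose(tau1, tau2):
    """Compose two finite weighted relations given as {(input, output): weight}.

    Assumption: weights live in the real semiring (+, *); other semirings
    would need their own addition and multiplication.
    """
    # Index the second relation by its input tree for fast lookup.
    by_input = defaultdict(list)
    for (t, u), w2 in tau2.items():
        by_input[t].append((u, w2))

    result = defaultdict(float)
    for (s, t), w1 in tau1.items():
        # Every intermediate tree t shared by both relations contributes w1 * w2.
        for u, w2 in by_input.get(t, []):
            result[(s, u)] += w1 * w2
    return dict(result)


# Hypothetical trees encoded as nested tuples: (label, child, child, ...).
s = ("S", ("NP",), ("VP",))
t1 = ("S1", ("VP",), ("NP",))
t2 = ("S2", ("NP",), ("VP",))
u = ("T", ("VP",), ("NP",))

tau1 = {(s, t1): 0.5, (s, t2): 0.5}
tau2 = {(t1, u): 0.5, (t2, u): 0.25}

composed = compose(tau1, tau2)
print(composed[(s, u)])  # 0.5 * 0.5 + 0.5 * 0.25 = 0.375
```

The labels `S1` and `S2` on the intermediate trees hint at why intermediate trees matter: distinct labels can carry state information from the first model to the second, which is the kind of finite-state encoding the abstract refers to.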
