Bottom-up Transfer in Example-based Machine Translation

This paper describes the transfer component of a syntax-based Example-based Machine Translation system. The parse tree of the source sentence is matched bottom-up against the source-language side of a parallel example treebank; the matches yield a target forest, which is passed to the target-language generation component. On a 500-sentence test set, the bottom-up approach is compared with the same system's top-down approach to transfer and yields much better results.
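The general idea of bottom-up transfer can be illustrated with a minimal sketch. This is not the system's implementation: the tree encoding (nested tuples), the `EXAMPLES` dictionary, the `transfer` function, and the toy word pairs are all hypothetical and were invented for illustration. The sketch visits children before their parent, prefers a match on the largest subtree, and otherwise combines the children's alternatives into a small set of target trees standing in for the target forest.

```python
from itertools import product

# Hypothetical toy example base: source subtree -> list of target subtrees.
# Trees are nested tuples (label, child, ...); leaves are plain strings.
EXAMPLES = {
    ("NP", "the", "house"): [("NP", "het", "huis")],
    "is": ["is"],
    "red": ["rood"],
}

def transfer(node, examples):
    """Return a list of target alternatives for `node` (a toy 'forest')."""
    if isinstance(node, str):
        # Leaf: use the example base, or copy the source word if unknown.
        return examples.get(node, [node])
    label, *children = node
    # Bottom-up order: children are transferred before the parent is tried.
    child_alternatives = [transfer(child, examples) for child in children]
    if node in examples:
        # The whole subtree matches an example: use its target side.
        return list(examples[node])
    # No match at this level: combine every choice of child alternatives.
    return [(label, *combo) for combo in product(*child_alternatives)]

source = ("S", ("NP", "the", "house"), ("VP", "is", "red"))
forest = transfer(source, EXAMPLES)
```

Here the `NP` subtree is matched as a whole, while the `VP` is assembled from its word-level matches. A real system would also need alignment information, scoring, and reordering, none of which this sketch attempts.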
