Meta-Structure Transformation Model for Statistical Machine Translation

We propose a novel syntax-based model for statistical machine translation in which meta-structure (MS) and meta-structure sequence (SMS) of a parse tree are defined. In this framework, a parse tree is decomposed into SMS to deal with the structure divergence and the alignment can be reconstructed at different levels of recombination of MS (RM). RM pairs extracted can perform the mapping between the sub-structures across languages. As a result, we have got not only the translation for the target language, but an SMS of its parse tree at the same time. Experiments with BLEU metric show that the model significantly outperforms Pharaoh, a state-art-the-art phrase-based system.

[1]  Srinivas Bangalore,et al.  Learning Dependency Translation Models as Collections of Finite-State Head Transducers , 2000, Computational Linguistics.

[2]  Daniel Marcu,et al.  A Phrase-Based,Joint Probability Model for Statistical Machine Translation , 2002, EMNLP.

[3]  Hermann Ney,et al.  Improved Alignment Models for Statistical Machine Translation , 1999, EMNLP.

[4]  Yuan Ding,et al.  Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars , 2005, ACL.

[5]  Francisco Casacuberta,et al.  Machine Translation with Inferred Stochastic Finite-State Transducers , 2004, Computational Linguistics.

[6]  I. Dan Melamed,et al.  Automatic Evaluation and Uniform Filter Cascades for Inducing N-Best Translation Lexicons , 1995, VLC@ACL.

[7]  Daniel Gildea,et al.  Loosely Tree-Based Alignment for Machine Translation , 2003, ACL.

[8]  Hermann Ney,et al.  Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information , 2004, CL.

[9]  Kevin Knight,et al.  A Syntax-based Statistical Translation Model , 2001, ACL.

[10]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[11]  Hermann Ney,et al.  Discriminative Training and Maximum Entropy Models for Statistical Machine Translation , 2002, ACL.

[12]  Christoph Tillmann,et al.  A Projection Extension Algorithm for Statistical Machine Translation , 2003, EMNLP.

[13]  Michael Collins,et al.  Head-Driven Statistical Models for Natural Language Parsing , 2003, CL.

[14]  Dekai Wu,et al.  Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora , 1997, CL.

[15]  Kevin Knight,et al.  Training Tree Transducers , 2004, NAACL.

[16]  Regina Barzilay,et al.  Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment , 2003, NAACL.

[17]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[18]  Philipp Koehn,et al.  Pharaoh: A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models , 2004, AMTA.

[19]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.