An Empirical Account of Compositionality of Translation through Translation Data

In both theoretical and applied research on machine translation, it is often assumed that translation between natural languages can be treated compositionally, yet it has proven far from trivial to develop a compositional translation system or to show theoretically that one exists. This thesis presents an empirical investigation of the compositionality of translation, whose main purpose is to find empirical evidence for the compositionality of actual translation data in the form of parallel corpora. All maximally compositional translation structures of sentences in word-aligned parallel corpora were studied to gain information about the system that generated them. In particular, it was studied whether monolingual information from dependency parses could form the basis of this underlying system. Experiments showed that barely more than fifty percent of the dependency relations were preserved during translation if no modifications of the dependency relations were allowed. Considering deeper versions of the dependency parses raised this score by over thirty percentage points for all datasets. A manual analysis showed that most of the structural deviations were caused by errors in the data or by systematic differences between the languages. These results encourage the development of compositional translation systems based on dependency parses. A proposal for doing so is presented in the discussion of this thesis. Tools to carry out this proposal, as well as tools for further empirical research, have been made available.
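To make the notion of a dependency relation being "preserved during translation" more concrete, the following is a minimal sketch and not the procedure used in the thesis. It assumes, purely for illustration, that a relation counts as preserved when the words dominated by the dependent project through the word alignment onto a target span that no word outside that subtree is aligned into (the usual phrase-pair consistency condition). The function names, the input encoding, and the toy sentences are all hypothetical.

```python
from typing import Dict, Set, Tuple

Alignment = Set[Tuple[int, int]]  # (source position, target position) links


def subtree(heads: Dict[int, int], node: int) -> Set[int]:
    """Return all source positions dominated by `node`, including itself."""
    result = {node}
    changed = True
    while changed:
        changed = False
        for child, head in heads.items():
            if head in result and child not in result:
                result.add(child)
                changed = True
    return result


def is_consistent(src: Set[int], alignment: Alignment) -> bool:
    """Assumed criterion: `src` maps to a target span that nothing else aligns into."""
    tgt = {t for s, t in alignment if s in src}
    if not tgt:
        return False  # a fully unaligned subtree is not counted as preserved
    lo, hi = min(tgt), max(tgt)
    return all(s in src for s, t in alignment if lo <= t <= hi)


def preserved_fraction(heads: Dict[int, int], alignment: Alignment) -> float:
    """Fraction of non-root dependency relations preserved under the assumed criterion.

    `heads` maps each source position to its head position; the root has head -1.
    """
    dependents = [d for d, h in heads.items() if h != -1]
    if not dependents:
        return 0.0
    kept = sum(is_consistent(subtree(heads, d), alignment) for d in dependents)
    return kept / len(dependents)


# Toy example 1: monotone alignment, every relation is preserved.
heads_a = {0: 1, 1: -1, 2: 1}
align_a = {(0, 0), (1, 1), (2, 2)}
score_a = preserved_fraction(heads_a, align_a)

# Toy example 2: the subtree {0, 1, 2} is split by the translation of word 3.
heads_b = {0: 2, 1: 2, 2: 3, 3: -1}
align_b = {(0, 0), (1, 3), (2, 1), (3, 2)}
score_b = preserved_fraction(heads_b, align_b)
```

In the second toy example, two of the three relations satisfy the assumed criterion; the third fails because the target word aligned to the head falls inside the span covered by the dependent's subtree. The criterion applied in the thesis is defined over maximally compositional translation structures and may differ from this simplification.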
