Syntax-Based Statistical Machine Translation: A Review

Ever since the inception of computers and the first introduction of artificial intelligence, machine translation has been a target goal, or better said, a dream that at some point in the past was deemed impossible (ALPAC 1966). The problem that machine translation aims to solve is simple to state: given a document or sentence in a source language, produce its equivalent in the target language. The problem is hard to solve because of the inherent ambiguity of language: the same word can have different meanings depending on the context, and idioms and many other computational factors add further difficulty. Moreover, extra domain knowledge is needed for high-quality output. Early techniques for solving this problem were human-intensive, relying on parsing, transfer rules and generation with the help of an interlingua (Hutchins 1995). This approach, while performing well in restricted domains, does not scale and is not suitable for languages for which we have no syntactic theory or parser. In the last decade, statistical techniques using the noisy channel model dominated the field and outperformed classical ones (Brown et al. 1993). However, one problem with statistical methods is that they do not employ enough linguistic theory to produce grammatically coherent output (Och et al. 2003). This is because these methods incorporate little or no explicit syntactic theory and capture elements of syntax only implicitly, via the n-gram language model in the noisy channel framework, which cannot model long-distance dependencies. The goal of syntax-based machine translation techniques is to incorporate an explicit representation of syntax into statistical systems to get the best of both worlds: high-quality output without intensive human effort. In this report we give an overview of the various approaches to syntax-aware statistical machine translation that have been developed or proposed in the last two decades.
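For reference, the noisy channel formulation mentioned above can be written compactly. This is only a sketch in the notation commonly used in the statistical machine translation literature; the symbols $f$ (source sentence), $e = e_1 \dots e_m$ (target sentence) and the n-gram order $n$ are assumed here for illustration and are not defined in this part of the report.

```latex
% Noisy channel decoding: choose the target sentence that best explains the source.
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} \, P(f \mid e)\, P(e)

% n-gram approximation of the language model P(e):
% each word is conditioned only on its n-1 predecessors.
P(e) \;\approx\; \prod_{i=1}^{m} P\!\left(e_i \mid e_{i-n+1}, \dots, e_{i-1}\right)
```

Under this approximation, any dependency between words more than $n-1$ positions apart is invisible to the language model, which is one way to see why such systems struggle with long-distance syntactic relations.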
In our survey, we stress the tension between the expressivity of a model and the complexity of its associated training and decoding procedures. The rest of this report is organized as follows. Section 2 gives a brief overview of the basic statistical machine translation model that serves as the basis of the subsequent discussion, and motivates the need for deploying syntax in the translation pipeline. Section 3 discusses the various grammar formalisms that have been proposed to model parallel texts. Section 4 describes how these theoretical ideas have been used to augment the basic models of Section 2, details how the resulting models are trained from data, and assesses their complexity against the extra accuracy gained. Finally, we conclude in Section 5.

[1] Dekai Wu, et al. Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora, 1997, CL.

[2] Yuan Ding, et al. Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars, 2005, ACL.

[3] William J. Byrne, et al. HMM Word and Phrase Alignment for Statistical Machine Translation, 2005, HLT.

[4] Hermann Ney, et al. Statistical Methods for Machine Translation, 2000.

[5] Daniel Gildea, et al. Synchronous Binarization for Machine Translation, 2006, NAACL.

[6] Anoop Sarkar, et al. Discriminative Reranking for Machine Translation, 2004, NAACL.

[7] Daniel Marcu, et al. Scalable Inference and Training of Context-Rich Syntactic Translation Models, 2006, ACL.

[8] Aravind K. Joshi, et al. Tree-Adjoining Grammars, 1997, Handbook of Formal Languages.

[9] Michael Collins, et al. A Discriminative Model for Tree-to-Tree Translation, 2006, EMNLP.

[10] Daniel Gildea, et al. Loosely Tree-Based Alignment for Machine Translation, 2003, ACL.

[11] Aravind K. Joshi, et al. Using Lexicalized Tags for Machine Translation, 1990, COLING.

[12] Chris Quirk, et al. Dependency Treelet Translation: Syntactically Informed Phrasal SMT, 2005, ACL.

[13] Daniel Marcu, et al. What’s in a translation rule?, 2004, NAACL.

[14] Daniel Marcu, et al. SPMT: Statistical Machine Translation with Syntactified Target Language Phrases, 2006, EMNLP.

[15] Heidi Fox, et al. Phrasal Cohesion and Statistical Machine Translation, 2002, EMNLP.

[16] Kevin Knight, et al. A Syntax-based Statistical Translation Model, 2001, ACL.

[17] Alexander M. Fraser, et al. Syntax for Statistical Machine Translation, 2003.

[18] Giorgio Satta, et al. Generalized Multitext Grammars, 2004, ACL.

[19] Jason Eisner, et al. Learning Non-Isomorphic Tree Mappings for Machine Translation, 2003, ACL.

[20] I. Dan Melamed, et al. Scalable Discriminative Learning for Natural Language Parsing and Translation, 2006, NIPS.

[21] Philipp Koehn, et al. Re-evaluating the Role of Bleu in Machine Translation Research, 2006, EACL.

[22] Daniel Marcu, et al. Statistical Phrase-Based Translation, 2003, NAACL.

[23] Nitin Madnani, et al. The Hiero Machine Translation System: Extensions, Evaluation, and Analysis, 2005, HLT.

[24] Stuart M. Shieber, et al. Synchronous Tree-Adjoining Grammars, 1990, COLING.

[25] Liang Huang, et al. Statistical Syntax-Directed Translation with Extended Domain of Locality, 2006, AMTA.

[26] David Yarowsky, et al. Inducing Multilingual Text Analysis Tools via Robust Projection across Aligned Corpora, 2001, HLT.

[27] Induction of Probabilistic Synchronous Tree-Insertion Grammars for Machine Translation, 2006, AMTA.

[28] Kevin Knight, et al. Training Tree Transducers, 2004, NAACL.

[29] José B. Mariño, et al. Morpho-syntactic Information for Automatic Error Analysis of Statistical Machine Translation Output, 2006, WMT@HLT-NAACL.

[30] Ding Liu, et al. Syntactic Features for Evaluation of Machine Translation, 2005, IEEvaluation@ACL.

[31] Daniel Gildea. Dependencies vs. Constituents for Tree-Based Alignment, 2004, EMNLP.

[32] Daniel Gildea, et al. Syntax-Based Alignment: Supervised or Unsupervised?, 2004, COLING.

[33] Daniel Gildea, et al. Inducing Word Alignments with Bilexical Synchronous Trees, 2006, ACL.

[34] W. J. Hutchins, et al. Machine Translation: A Brief History, 1995.

[35] Hermann Ney, et al. The Alignment Template Approach to Statistical Machine Translation, 2004, CL.

[36] I. Dan Melamed, et al. Multitext Grammars and Synchronous Parsers, 2003, NAACL.

[37] Robert L. Mercer, et al. The Mathematics of Statistical Machine Translation: Parameter Estimation, 1993, CL.

[38] I. D. Melamed. Algorithms for syntax-aware statistical machine translation, 2004, TMI.

[39] Philippe Langlais, et al. Phrase-Based SMT with Shallow Tree-Phrases, 2006, WMT@HLT-NAACL.

[40] Kevin Knight, et al. A Decoder for Syntax-based Statistical MT, 2002, ACL.

[41] Michael T. Ward. Concise history of the language sciences: From the Sumerians to the cognitivists, 1997.

[42] Chris Quirk, et al. The impact of parse quality on syntactically-informed statistical machine translation, 2006, EMNLP.

[43] David Chiang, et al. A Hierarchical Phrase-Based Model for Statistical Machine Translation, 2005, ACL.

[44] Andreas Zollmann, et al. Syntax Augmented Machine Translation via Chart Parsing, 2006, WMT@HLT-NAACL.

[45] Daniel Marcu, et al. A Phrase-Based, Joint Probability Model for Statistical Machine Translation, 2002, EMNLP.

[46] David A. Smith, et al. Quasi-Synchronous Grammars: Alignment by Soft Projection of Syntactic Dependencies, 2006, WMT@HLT-NAACL.

[47] John R. Pierce, et al. Language and Machines: Computers in Translation and Linguistics, 1966.

[48] Daniel Gildea, et al. Stochastic Lexicalized Inversion Transduction Grammar for Alignment, 2005, ACL.

[49] David Chiang, et al. Better k-best Parsing, 2005, IWPT.

[50] David Chiang, et al. An Introduction to Synchronous Grammars, 2006.