CCG-based Models for Statistical Machine Translation

Arguably the best-performing statistical machine translation systems are based on context-free formalisms or weakly equivalent ones. These models usually use a synchronous context-free grammar (SCFG), which we argue is too rigid for the highly ambiguous task of human language translation. The problem is exacerbated by the imperfect methods available for aligning parallel texts, which make extracting an efficient grammar very hard. As a result, the extracted context-free grammars are usually very large, even after being restricted through a variety of constraints. We propose to use Combinatory Categorial Grammar (CCG) for machine translation models. CCG is a lexicalized, mildly context-sensitive formalism that is well suited to capturing long-distance dependencies, which most current models do not address well. We believe that CCG is well suited to machine translation for two reasons: it can give a syntactic representation to non-constituents, which occur frequently in parallel texts, and it has high derivational flexibility. This allows us to bring some of the advantages of non-syntactic phrase-based approaches, such as a relatively small grammar size compared to context-free-based machine translation grammars, into a syntactic framework. A number of models leveraging the advantages of CCG are possible; however, our principal goal is to develop a string-to-tree model that projects CCG onto the target side of a synchronous grammar. We intend to apply the vast progress made in monolingual CCG parsing to machine translation. Additionally, we propose to extend CCG to a synchronous grammar (SCCG), as has been done for related formalisms such as tree-adjoining grammars. We hope that an SCCG may provide derivational flexibility similar to that of monolingual CCG, which may result in a better model of translational equivalence.
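To make the notions of non-constituents and derivational flexibility concrete, the following is a minimal sketch of three standard CCG combinators (application, forward composition, and type raising) applied to a toy transitive sentence of the form "John likes Mary". The class and function names (`Functor`, `forward_apply`, `type_raise`, and so on) and the toy lexicon are illustrative assumptions, not part of the proposed translation models. The sketch shows that the string "John likes", which is not a constituent in a conventional phrase-structure analysis, receives the category S/NP, and that the same sentence has two distinct derivations.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg (seeks arg to the right) or result\\arg (to the left)."""
    result: "Cat"
    slash: str  # "/" or "\\"
    arg: "Cat"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


# Atomic categories are plain strings; complex ones are Functor instances.
Cat = Union[str, Functor]


def forward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward application:  X/Y  Y  =>  X"""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Backward application:  Y  X\\Y  =>  X"""
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


def forward_compose(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward composition:  X/Y  Y/Z  =>  X/Z"""
    if (isinstance(left, Functor) and left.slash == "/"
            and isinstance(right, Functor) and right.slash == "/"
            and left.arg == right.result):
        return Functor(left.result, "/", right.arg)
    return None


def type_raise(cat: Cat, target: Cat) -> Cat:
    """Forward type raising:  X  =>  T/(T\\X)"""
    return Functor(target, "/", Functor(target, "\\", cat))


# Toy lexicon (hypothetical): both names are NP, the verb is (S\NP)/NP.
NP, S = "NP", "S"
TV = Functor(Functor(S, "\\", NP), "/", NP)

# Derivation 1: the verb first combines with its object, then with the subject.
verb_phrase = forward_apply(TV, NP)          # (S\NP)
sentence_1 = backward_apply(NP, verb_phrase)

# Derivation 2: the subject is type-raised and composed with the verb,
# so the non-constituent "John likes" gets a category of its own.
raised_subject = type_raise(NP, S)           # (S/(S\NP))
john_likes = forward_compose(raised_subject, TV)
sentence_2 = forward_apply(john_likes, NP)

print(john_likes, sentence_1, sentence_2)
```

Both derivations end in the category S. The intermediate category S/NP for the subject-verb fragment is the kind of syntactically typed non-constituent that a phrase pair extracted from aligned parallel text could be labelled with.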
