Induction of Probabilistic Synchronous Tree-Insertion Grammars

Draft. Comments welcomed. Please do not cite or quote without prior consent. Revision 1.9 of August 15, 2005, 15:45:21, generated August 15, 2005.

Increasingly, researchers developing statistical machine translation systems have moved to incorporate syntactic structure in the models that they induce. These researchers are motivated by the intuition that the limitations of finite-state translation models, exemplified by IBM’s “Model 5”, follow from their inability to use phrasal and hierarchical information in the interlingual mapping. What is desired is a formalism that combines the substitution-based hierarchical structure provided by context-free grammars (CFGs) with the capacity of n-gram models to capture lexical relationships, at a processing cost no worse than that of CFGs. Further, it should ideally allow for discontinuity in phrases and be synchronizable, to allow for multilinguality. Finally, in order to support automated induction, it should allow for a probabilistic variant. We introduce probabilistic synchronous tree-insertion grammars (PSTIG) as such a formalism. In this paper, we define a restricted version of PSTIG and provide algorithms for parsing, parameter estimation, and translation. As a proof of concept, we successfully apply these algorithms to a toy problem: corpus-based induction of a statistical translator of arithmetic expressions from postfix to partially parenthesized infix.
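To make the toy problem concrete, the following is a minimal sketch of the target mapping, written as a deterministic converter rather than an induced model. It is not the method of the paper. It assumes that "partially parenthesized" means parentheses appear only where precedence or left-associativity requires them; that reading, the operator set, and the names `PRECEDENCE` and `postfix_to_infix` are all illustrative assumptions.

```python
# Hypothetical illustration of the toy task's input/output mapping.
# Assumption: parentheses are emitted only where operator precedence or
# left-associativity would otherwise change the value of the expression.

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPERAND_PREC = 3  # operands bind tighter than any operator


def postfix_to_infix(tokens):
    """Convert a list of postfix tokens to a partially parenthesized infix string."""
    stack = []  # each entry is (text, precedence of its top-level operator)
    for tok in tokens:
        if tok in PRECEDENCE:
            if len(stack) < 2:
                raise ValueError("operator %r lacks operands" % tok)
            right, right_prec = stack.pop()
            left, left_prec = stack.pop()
            prec = PRECEDENCE[tok]
            # A left operand needs parentheses only if it binds more loosely.
            if left_prec < prec:
                left = "(" + left + ")"
            # A right operand also needs them at equal precedence when the
            # operator is not associative (subtraction, division).
            if right_prec < prec or (right_prec == prec and tok in ("-", "/")):
                right = "(" + right + ")"
            stack.append((left + " " + tok + " " + right, prec))
        else:
            stack.append((tok, OPERAND_PREC))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0][0]


if __name__ == "__main__":
    for postfix in ("3 4 + 5 *", "3 4 5 * +", "8 2 3 - -"):
        print(postfix, "=>", postfix_to_infix(postfix.split()))
```

A converter of this kind could be used to produce postfix/infix sentence pairs for a parallel corpus, though whether the paper's training data was generated this way is not stated here.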
