Tree Insertion Grammar: A Cubic-Time, Parsable Formalism that Lexicalizes Context-Free Grammar without Changing the Trees Produced

Tree insertion grammar (TIG) is a tree-based formalism that makes use of tree substitution and tree adjunction. TIG is related to tree adjoining grammar. However, the adjunction permitted in TIG is sufficiently restricted that TIGs only derive context-free languages and TIGs have the same cubic-time worst-case complexity bounds for recognition and parsing as context-free grammars. An efficient Earley-style parser for TIGs is presented.

Any context-free grammar (CFG) can be converted into a lexicalized tree insertion grammar (LTIG) that generates the same trees. A constructive procedure is presented for converting a CFG into a left anchored (i.e., word initial) LTIG that preserves ambiguity and generates the same trees. The LTIG created can be represented compactly by taking advantage of sharing between the elementary trees in it. Methods of converting CFGs into left anchored CFGs, e.g., the methods of Greibach and Rosenkrantz, do not preserve the trees produced and result in very large output grammars.

For the purpose of experimental evaluation, the LTIG lexicalization procedure was applied to eight different CFGs for subsets of English. The LTIGs created were smaller than the original CFGs. Using an implementation of the Earley-style TIG parser that was specialized for left anchored LTIGs, it was possible to parse more quickly with the LTIGs than with the original CFGs.
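
To make the terms "lexicalized" and "left anchored (i.e., word initial)" concrete, the following is a minimal Python sketch, not the paper's procedure. It assumes a hypothetical encoding in which an elementary tree is either a leaf string or a `(label, children)` pair, words are lowercase strings, and nonterminal leaves are uppercase strings. The helper names `frontier`, `is_lexicalized`, and `is_left_anchored` are illustrative only.

```python
# Hypothetical encoding (an assumption, not taken from the paper):
#   a tree is either a leaf string or a (label, [children]) pair;
#   lowercase leaves are words (terminals), uppercase leaves are
#   nonterminal frontier nodes such as substitution sites.

def frontier(tree):
    """Return the leaves of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    _label, children = tree
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves

def is_terminal(symbol):
    """Under the assumed convention, words are lowercase strings."""
    return symbol.islower()

def is_lexicalized(tree):
    """A tree is lexicalized if its frontier contains at least one word."""
    return any(is_terminal(s) for s in frontier(tree))

def is_left_anchored(tree):
    """A tree is left anchored if the first frontier element is a word."""
    leaves = frontier(tree)
    return bool(leaves) and is_terminal(leaves[0])

# A bare CFG rule S -> NP VP viewed as a tree: no word on the frontier.
rule_tree = ("S", ["NP", "VP"])
# The same shape with a word under NP: word initial, hence left anchored.
anchored_tree = ("S", [("NP", ["john"]), "VP"])
# Lexicalized, but the word is not the first frontier element.
right_word_tree = ("VP", ["VP", ("Adv", ["quickly"])])
```

Under these assumptions, `rule_tree` fails both checks, `anchored_tree` passes both, and `right_word_tree` is lexicalized but not left anchored. Note that `anchored_tree` keeps the S, NP, VP structure of the original rule, which is the sense in which lexicalization can leave the derived trees unchanged.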

[1]  Yves Schabes The Valid Prefix Property and Left to Right Parsing of Tree-Adjoining Grammar , 1991, IWPT.

[2]  Andreas Podelski,et al.  Tree Automata and Languages , 1992 .

[3]  James W. Thatcher,et al.  Characterizing Derivation Trees of Context-Free Grammars through a Generalization of Finite Automata Theory , 1967, J. Comput. Syst. Sci..

[4]  Aravind K. Joshi,et al.  Tree-adjoining grammars and lexicalized grammars , 1992, Tree Automata and Languages.

[5]  David J. Weir,et al.  Tree Adjoining and Head Wrapping , 1986, COLING.

[6]  Hidetomo Ichihashi,et al.  Possibilistic Linear Programming with Measurable Multiattribute Value Functions , 1989, INFORMS J. Comput..

[7]  William T. Freeman,et al.  An animated on-line community with artificial agents , 1994, IEEE MultiMedia.

[8]  Michael A. Arbib,et al.  An Introduction to Formal Language Theory , 1988, Texts and Monographs in Computer Science.

[9]  Noam Chomsky,et al.  Lectures on Government and Binding , 1981 .

[10]  Aravind K. Joshi,et al.  Parsing Strategies with ‘Lexicalized’ Grammars: Application to Tree Adjoining Grammars , 1988, COLING.

[11]  Anne Abeillé,et al.  A Lexicalized Tree Adjoining Grammar for English , 1990 .

[12]  Jay Earley,et al.  An efficient context-free parsing algorithm , 1970, Commun. ACM.

[13]  Aravind K. Joshi,et al.  A study of tree adjoining grammars , 1987 .

[14]  David J. Weir,et al.  Characterizing mildly context-sensitive grammar formalisms , 1988 .

[15]  A. V. Yazenin,et al.  Fuzzy and stochastic programming , 1987 .

[16]  S. Kataoka A Stochastic Programming Model , 1963 .

[17]  Stuart M. Shieber,et al.  Principles and Implementation of Deductive Parsing , 1994, J. Log. Program..

[18]  H. Zimmermann Fuzzy programming and linear programming with several objective functions , 1978 .

[19]  H. Ishibuchi,et al.  Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters , 1990 .

[20]  Eric Brill,et al.  Deducing linguistic structure from the statistics of large corpora , 1990 .

[21]  寺津 典子,et al.  Book Review: N. Chomsky, Lectures on Government and Binding , 1983 .

[22]  Ivan A. Sag,et al.  Information-based syntax and semantics , 1987 .

[23]  I. M. Stancu-Minasian,et al.  Stochastic Programming: with Multiple Objective Functions , 1985 .

[24]  Aravind K. Joshi,et al.  Lexicalized TAGs, Parsing and Lexicons , 1989, HLT.

[25]  Aravind K. Joshi,et al.  Mathematical and computational aspects of lexicalized grammars , 1990 .

[26]  Daniel J. Rosenkrantz,et al.  Matrix Equations and Normal Forms for Context-Free Grammars , 1967, JACM.

[27]  Walter L. Ruzzo,et al.  An Improved Context-Free Recognizer , 1980, ACM Trans. Program. Lang. Syst..

[28]  Masaru Tomita,et al.  Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems , 1985 .

[29]  R. Słowiński,et al.  Stochastic Versus Fuzzy Approaches to Multiobjective Mathematical Programming under Uncertainty , 1990, Theory and Decision Library.

[30]  A. M. Geoffrion Stochastic Programming with Aspiration or Fractile Criteria , 1967 .

[31]  Hiroaki Ishii,et al.  A Generalized Chance Constraint Programming Problem , 1978 .

[32]  Roman Słowiński,et al.  Fuzzy Versus Stochastic Approaches to Multicriteria Linear Programming under Uncertainty , 1988 .

[33]  K. Vijay-Shanker,et al.  Some Computational Properties of Tree Adjoining Grammars , 1985, ACL 1985.

[34]  David J. Weir,et al.  Parsing Some Constrained Grammar Formalisms , 1993, Comput. Linguistics.

[35]  Aravind K. Joshi,et al.  Some Computational Properties of Tree Adjoining Grammars , 1985, ACL.

[36]  XTAG Research Group,et al.  A Lexicalized Tree Adjoining Grammar for English , 1998, ArXiv.

[37]  Geoffrey K. Pullum,et al.  Generalized Phrase Structure Grammar , 1985 .

[38]  Richard C. Waters,et al.  Lexicalized Context-Free Grammars , 1993, ACL.

[40]  Bernard Lang,et al.  The systematic construction of Earley Parsers: Application to the production of an O(n^6) Earley Parser for Tree Adjoining Grammars , 1990, Tag.

[41]  A. Celmins Least squares model fitting to fuzzy vector data , 1987 .

[42]  Hidetomo Ichihashi,et al.  Relative modalities and their use in possibilistic linear programming , 1990 .

[43]  Mark Steedman,et al.  Combinatory grammars and parasitic gaps , 1987 .

[44]  Sheila A. Greibach,et al.  A New Normal-Form Theorem for Context-Free Phrase Structure Grammars , 1965, JACM.

[45]  William T. Freeman,et al.  Demonstration of an interactive multimedia environment , 1994, Computer.

[46]  Maurice Gross,et al.  Lexicon-Grammar and the Syntactic Analysis of French , 1984, ACL.

[47]  Stuart M. Shieber,et al.  An Alternative Conception of Tree-Adjoining Derivation , 1992, ACL.

[48]  Eric Brill,et al.  Deducing Linguistic Structure from the Statistics of Large Corpora , 1990, HLT.