Left-corner parsing algorithm for unification grammars

Parsing with unification grammars is inefficient because of the grammars' expressive power. Most unification-based parsing algorithms are extensions of context-free (CF) parsing algorithms, and few have been designed specifically for unification-style grammars. We have developed an efficient parsing algorithm for unification grammars that takes full advantage of the expressiveness of the grammar. Our algorithm, called LC, is a variation of left-corner parsing, and it exhibits significantly better average-case performance than previous unification-based parsers.

The efficiency of the LC algorithm comes from two factors. The first is the representation and architecture of LINK, a unification-based system that integrates syntax and semantics: it dynamically combines syntax (grammar) and semantics (domain knowledge) and uses all available information at any given point during parsing. The second is the expectation-based left-corner parsing strategy. By using expectations, the algorithm can eliminate parses that cannot succeed because they do not fit the left context (the preceding word or words in the sentence).

The central focus of this thesis is the formalization of the LC algorithm and the proof of its correctness. We specify the algorithm using the constraint-based grammar formalism presented in (Shieber, 1992). In this formulation, LC is characterized as an optimization of the abstract parsing algorithm developed in (Shieber, 1992). We then prove the correctness of LC by reducing it to Shieber's algorithm and appealing to his proof of correctness. In formulating the proof, we discovered a difficulty in both Shieber's algorithm and LC: for certain grammars, the algorithms may spuriously create nonminimal derivations in addition to the minimal ones. As it turns out, this nonminimal derivation problem raises important issues concerning some of the basic notions in unification grammar and unification-based parsing. We discuss the problem in depth, including its sources and possible solutions.

Finally, we present empirical results obtained from running LINK on a corpus of example sentences taken from real-world texts. The results indicate that, for these limited-domain texts, LINK achieved linear-time average-case performance, a marked improvement over other unification-based parsing algorithms.
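
To illustrate the idea of filtering by left-corner expectations mentioned in the abstract, here is a minimal sketch on a plain context-free backbone. It is not the LC algorithm of the thesis, which operates on feature structures inside LINK; the toy grammar and the names `GRAMMAR`, `left_corner_table`, and `fits_expectation` are assumptions made only for this example.

```python
# Hypothetical toy grammar: context-free rules only, no feature structures.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["PN"]],
    "VP": [["V", "NP"], ["V"]],
}


def left_corner_table(grammar):
    """Return reach[A]: every category that can begin a phrase of category A.

    This is the reflexive-transitive closure of the left-corner relation,
    where B is a left corner of A if some rule has the form A -> B ...
    Rules are assumed to have non-empty right-hand sides.
    """
    symbols = set(grammar)
    for alternatives in grammar.values():
        for rhs in alternatives:
            symbols.update(rhs)

    reach = {symbol: {symbol} for symbol in symbols}
    changed = True
    while changed:  # iterate until no set grows any further
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = reach[rhs[0]] - reach[lhs]
                if new:
                    reach[lhs] |= new
                    changed = True
    return reach


def fits_expectation(reach, expected, category):
    """True if a constituent of `category` can start the `expected` phrase."""
    return category in reach[expected]


reach = left_corner_table(GRAMMAR)
# When an S is expected, a determiner may start it, but a verb may not,
# so a bottom-up step proposing V at that position can be discarded early.
print(fits_expectation(reach, "S", "Det"))
print(fits_expectation(reach, "S", "V"))
```

In a unification-based setting the membership test would presumably be replaced by a unification check between the expected feature structure and the proposed constituent; that step is outside the scope of this sketch.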

[1]  Robert R. Burridge,et al.  Literal Meaning and the Comprehension of Metaphors , 1992, AAAI.

[2]  Ronald M. Kaplan,et al.  The Formal Architecture of Lexical-Functional Grammar , 1989, J. Inf. Sci. Eng.

[3]  Carlo Cecchetto,et al.  Introduction to Government and Binding Theory , 1996 .

[4]  James Kilbury,et al.  A Modification of the Earley-Shieber Algorithm for Direct Parsing of ID/LP Grammars , 1984, GWAI.

[5]  Dale Gerdemann,et al.  The Correct and Efficient Implementation of Appropriateness Specifications for Typed Feature Structures , 1994, COLING.

[6]  Peter Sells,et al.  Lectures on contemporary syntactic theories , 1985 .

[7]  Noriko Tomuro.  Semi-automatic Induction of Systematic Polysemy from WordNet , 1998, WordNet@ACL/COLING.

[8]  Alex Franz,et al.  A parser for HPSG , 1990 .

[9]  David H. D. Warren,et al.  Definite Clause Grammars for Language Analysis - A Survey of the Formalism and a Comparison with Augmented Transition Networks , 1980, Artif. Intell.

[10]  Ronald M. Kaplan,et al.  The Interface between Phrasal and Functional Constraints , 1993, Comput. Linguistics.

[11]  Ted Briscoe,et al.  A Formalism and Environment for the Development of a Large Grammar of English , 1987, IJCAI.

[12]  Alfred V. Aho,et al.  The Theory of Parsing, Translation, and Compiling , 1972 .

[13]  Peter M. Hastings.  Automatic acquisition of word meaning from context , 1994 .

[14]  Geoffrey K. Pullum,et al.  Generalized Phrase Structure Grammar , 1985 .

[15]  Stuart M. Shieber,et al.  An Introduction to Unification-Based Approaches to Grammar , 1986, CSLI Lecture Notes.

[16]  Stuart M. Shieber,et al.  Principles and Implementation of Deductive Parsing , 1994, J. Log. Program.

[17]  Lauri Karttunen,et al.  D-PATR: A Development Environment for Unification-Based Grammars , 1986, COLING.

[18]  M. Baltin,et al.  The Mental representation of grammatical relations , 1985 .

[19]  David H. D. Warren,et al.  Parsing as Deduction , 1983, ACL.

[20]  John T. Maxwell,et al.  Formal issues in lexical-functional grammar , 1998 .

[21]  Jay Earley,et al.  An efficient context-free parsing algorithm , 1970, Commun. ACM.

[22]  Ivan A. Sag,et al.  Information-Based Syntax and Semantics: Volume 1, Fundamentals , 1987 .

[23]  Gertjan van Noord.  An Efficient Implementation of the Head-Corner Parser , 1997, CL.

[24]  Robert T. Kasper,et al.  A Logical Semantics for Feature Structures , 1986, ACL.

[25]  Gertjan van Noord,et al.  Reversibility in Natural Language Processing , 1993 .

[26]  F. Heny,et al.  An Introduction to the Principles of Transformational Syntax , 1975 .

[27]  Douglas E. Appelt,et al.  Bidirectional Grammars and the Design of Natural Language Generation Systems , 1987, TINLAP.

[28]  John F. Sowa,et al.  Building large knowledge-based systems: Representation and inference in the cyc project: D.B. Lenat and R.V. Guha , 1993 .

[29]  Martin Kay,et al.  Head-Driven Parsing , 1989, IWPT.

[30]  John A. Carroll.  Practical unification-based parsing of Natural Language , 1993 .

[31]  Douglas E. Appelt,et al.  FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text , 1997, ArXiv.

[32]  Steven L. Lytinen.  Dynamically Combining Syntax and Semantics in Natural Language Processing , 1986, AAAI.

[33]  Alfred V. Aho,et al.  Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.

[34]  Mitchell P. Marcus,et al.  A theory of syntactic recognition for natural language , 1979 .

[35]  Daniel H. Younger,et al.  Recognition and Parsing of Context-Free Languages in Time n^3 , 1967, Inf. Control.

[36]  Stuart M. Shieber,et al.  A Simple Reconstruction of GPSG , 1986, COLING.

[37]  Robert C. Berwick,et al.  Computational complexity and natural language , 1987 .

[38]  Paul John King,et al.  Typed Feature Structures as Descriptions , 1994, COLING.

[39]  Stuart M. Shieber,et al.  A Uniform Architecture for Parsing and Generation , 1988, COLING.

[40]  Roland Seiffert.  Unification-ID/LP Grammars: Formalization and Parsing , 1991, Text Understanding in LILOG.

[41]  Liliane Haegeman,et al.  Introduction to Government and Binding Theory , 1991 .

[42]  Noam Chomsky,et al.  Lectures on Government and Binding , 1981 .

[43]  Noriko Tomuro.  Maximizing Top-down Constraints for Unification-based Systems , 1996, ACL.

[44]  Stuart M. Shieber,et al.  Prolog and Natural-Language Analysis , 1987 .

[45]  William A. Woods,et al.  Transition Network Grammars for Natural Language Analysis , 1970, Commun. ACM.

[46]  Ronald C. Shank,et al.  Theoretical Issues in Natural Language Processing , 1975 .

[47]  Stuart M. Shieber,et al.  Using Restriction to Extend Parsing Algorithms for Complex-Feature-Based Formalisms , 1985, ACL.

[48]  John A. Carroll.  Relating Complexity to Practical Performance in Parsing With Wide-Coverage Unification Grammars , 1994, ACL.

[49]  Frank Morawietz,et al.  Formalization and Parsing of Typed Unification-Based ID/LP Grammars , 1995, ArXiv.

[50]  Martin Kay,et al.  Algorithm schemata and data structures in syntactic processing , 1986 .

[51]  Gosse Bouma.  Feature Structures and Nonmonotonicity , 1992, Comput. Linguistics.

[52]  Steven L. Lytinen,et al.  ULINK: A Semantics-Driven Approach to Understanding Ungrammatical Input , 1991, AAAI.

[53]  Anuj Dawar,et al.  An Interpretation of Negation in Feature Structure Descriptions , 1990, Comput. Linguistics.

[54]  Ramanathan V. Guha,et al.  Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project , 1990 .

[55]  Rémi Zajac,et al.  Inheritance and Constraint-Based Grammar Formalisms , 1992, Comput. Linguistics.

[56]  Martin Kay,et al.  Functional Unification Grammar: A Formalism for Machine Translation , 1984, ACL.

[57]  Steven L. Lytinen,et al.  A unification-based, integrated natural language processing system , 1992 .

[58]  Klaas Sikkel,et al.  Parsing Schemata , 1997, Texts in Theoretical Computer Science An EATCS Series.

[59]  Robert A. Kowalski,et al.  Algorithm = logic + control , 1979, CACM.

[60]  H. Alshawi,et al.  The Core Language Engine , 1994 .

[61]  Bob Carpenter,et al.  The logic of typed feature structures , 1992 .

[62]  Christian R. Huyck,et al.  Description of the LINK system used for MUC-5 , 1993, MUC.

[63]  Ted Briscoe,et al.  Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars , 1993, CL.

[64]  Sergei Nirenburg,et al.  The KBMT project : a case study in knowledge-based machine translation , 1991 .

[65]  Jerry R. Hobbs,et al.  Interpretation as Abduction , 1993, Artif. Intell.

[66]  Dale Gerdemann,et al.  Off-line Optimization for Earley-style HPSG Processing , 1995, EACL.

[67]  C. Pollard,et al.  Center for the Study of Language and Information , 2022 .

[68]  Glenn D. Blank.  A finite and real-time processor for natural language , 1989, CACM.

[69]  Robert Dale,et al.  Towards Robust PATR , 1992, COLING.

[70]  Patrick Shann.  Experiments with GLR and Chart Parsing , 1991 .

[71]  Stuart M. Shieber,et al.  Constraint-based grammar formalisms , 1992 .

[72]  Mihai Nadin.  T. Winograd, Language as a Cognitive Process, Volume I: Syntax , 1985, Artif. Intell.

[73]  Jesús A. López,et al.  Generalized LR parsing , 2004, Machine Translation.