Tabulation for Multi-Purpose Partial Parsing

Efficient partial parsing systems (chunkers) are urgently required in many natural language application areas, because such parsers still produce partially parsed text when the input does not fully fit the existing lexica and grammars. Partially parsed corpora are in turn necessary for extracting various kinds of information that can be fed back into those systems, thereby increasing their processing power. In this paper, we propose an efficient partial parsing scheme, based on chart parsing, that is flexible enough to support both normal parsing tasks and the diagnosis, over previously obtained partial parses, of the possible causes (kinds of faults) that led to partial rather than complete parses. Using the built-in tabulation capabilities of the DyALog system, we implemented a partial parser that runs as fast as the best non-deterministic parsers. We elaborate on the implementation of two different grammar formalisms: Definite Clause Grammars (DCG) extended with head declarations, and Bound Movement Grammars (BMG).
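
As a rough illustration of how a tabulated chart can yield partial parses when the lexicon or grammar does not cover the input, the sketch below fills a CKY-style table and then reads off the longest recognised constituents from left to right. It is not the DyALog implementation described here: the toy grammar, the lexicon, the `UNK` label, the greedy longest-match read-out, and the function names `build_chart` and `partial_parse` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy grammar: binary rules (left, right) -> parent categories.
BINARY = {
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
}
# Hypothetical lexicon; "barks" is deliberately missing.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def build_chart(tokens):
    """Fill a table where chart[(i, j)] holds the categories spanning tokens[i:j]."""
    n = len(tokens)
    chart = defaultdict(set)
    for i, tok in enumerate(tokens):
        chart[(i, i + 1)] |= LEXICON.get(tok, set())
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY.get((left, right), set())
    return chart


def partial_parse(tokens):
    """Greedily cover the input with the longest constituents found in the chart."""
    chart = build_chart(tokens)
    n = len(tokens)
    chunks = []
    i = 0
    while i < n:
        for j in range(n, i, -1):  # try the longest span first
            if chart[(i, j)]:
                label = sorted(chart[(i, j)])[0]  # deterministic pick among categories
                chunks.append((label, tokens[i:j]))
                i = j
                break
        else:
            # No category covers any span starting at i: emit the token as unknown.
            chunks.append(("UNK", tokens[i:i + 1]))
            i += 1
    return chunks


if __name__ == "__main__":
    print(partial_parse("the dog sees the cat".split()))
    print(partial_parse("the dog barks the cat".split()))
```

With this toy grammar the first sentence is covered by a single `S`, while the second, which contains a word missing from the lexicon, still yields two `NP` chunks around an `UNK` token. The table entries computed once are reused for every read-out, which is the role tabulation plays in this sketch.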
