Transforming a Chunker to a Parser

Ever since the landmark paper of Ramshaw and Marcus (1995), machine learning systems have been used successfully to identify base phrases (chunks), the lowest-level constituents of a parse tree. We extend a state-of-the-art chunking algorithm into a bottom-up parser by recursively applying the chunker to its own output. After testing different training configurations, we obtain a reasonable parser, which we evaluate on a standard data set. Its performance falls behind that of current state-of-the-art parsers. We suggest modifications to the parser that may lead to future performance improvements.
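
The idea of recursively applying a chunker to its own output can be illustrated with a minimal sketch. It is not the system described here: the learned chunker is replaced by a hypothetical rule-based stand-in (`toy_chunker`), and the names `Node`, `parse`, `bracket` and the three patterns are assumptions made only for illustration. Each pass chunks the current sequence of labels, replaces every chunk by a single phrase node, and feeds the shortened sequence back to the chunker until no new chunk is found.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    label: str                        # POS tag for leaves, phrase label otherwise
    word: str = ""                    # surface form, set for leaves only
    children: List["Node"] = field(default_factory=list)


# A chunker maps a label sequence to non-overlapping, left-to-right
# (start, end, phrase_label) spans, with `end` exclusive.
Chunker = Callable[[List[str]], List[Tuple[int, int, str]]]


def toy_chunker(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Hypothetical stand-in for a learned chunker: greedy pattern matching."""
    patterns = [
        (("DT", "NN"), "NP"),
        (("VBD", "NP"), "VP"),
        (("NP", "VP"), "S"),
    ]
    spans, i = [], 0
    while i < len(labels):
        for pattern, phrase in patterns:
            if tuple(labels[i:i + len(pattern)]) == pattern:
                spans.append((i, i + len(pattern), phrase))
                i += len(pattern)
                break
        else:
            i += 1  # no pattern starts here; move on
    return spans


def parse(nodes: List[Node], chunker: Chunker, max_levels: int = 20) -> List[Node]:
    """Build a tree bottom-up by re-running the chunker on its own output."""
    for _ in range(max_levels):
        spans = chunker([n.label for n in nodes])
        if not spans:
            break  # nothing left to group
        merged, pos = [], 0
        for start, end, phrase in spans:
            merged.extend(nodes[pos:start])                          # unchunked nodes pass through
            merged.append(Node(phrase, children=nodes[start:end]))   # chunk becomes one node
            pos = end
        merged.extend(nodes[pos:])
        nodes = merged
    return nodes


def bracket(node: Node) -> str:
    """Render a tree in bracketed notation."""
    if not node.children:
        return f"({node.label} {node.word})"
    return f"({node.label} " + " ".join(bracket(c) for c in node.children) + ")"


sentence = [Node("DT", "the"), Node("NN", "dog"), Node("VBD", "chased"),
            Node("DT", "the"), Node("NN", "cat")]
print(" ".join(bracket(n) for n in parse(sentence, toy_chunker)))
```

In this toy setting the first pass finds the two noun phrases, the second pass groups the verb with its object, and the third pass joins subject and verb phrase. A learned chunker would instead be trained on the constituents at each level of the treebank; how the levels are split across training runs is one of the configuration choices the abstract alludes to.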

[1] David M. Magerman. Statistical Decision-Tree Models for Parsing, 1995, ACL.

[2] Steven Abney, et al. Parsing By Chunks, 1991.

[3] Rens Bod, et al. Parsing with the Shortest Derivation, 2000, COLING.

[4] Kenneth Ward Church. A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text, 1988, ANLP.

[5] Thorsten Brants, et al. Cascaded Markov Models, 1999, EACL.

[6] Dan Roth, et al. A Learning Approach to Shallow Parsing, 1999, EMNLP.

[7] Eugene Charniak, et al. Statistical Parsing with a Context-Free Grammar and Word Statistics, 1997, AAAI/IAAI.

[8] Adwait Ratnaparkhi, et al. A Maximum Entropy Model for Part-Of-Speech Tagging, 1996, EMNLP.

[9] Mitchell P. Marcus, et al. Text Chunking using Transformation-Based Learning, 1995, VLC@ACL.

[10] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[11] Erik F. Tjong Kim Sang, et al. Noun Phrase Recognition by System Combination, 2000, ANLP.

[12] Erik F. Tjong Kim Sang, et al. Text Chunking by System Combination, 2000, CoNLL/LLL.

[13] John D. Lafferty, et al. Towards History-based Grammars: Using Richer Models for Probabilistic Parsing, 1993, ACL.

[14] Eugene Charniak, et al. A Maximum-Entropy-Inspired Parser, 2000, ANLP.

[15] Walter Daelemans, et al. TiMBL: Tilburg Memory-Based Learner, version 2.0, Reference guide, 1998.

[16] Mitchell P. Marcus, et al. Maximum entropy models for natural language ambiguity resolution, 1998.