Evaluating State-of-the-Art Treebank-style Parsers for Coh-Metrix and Other Learning Technology Environments

This paper evaluates a series of freely available, state-of-the-art parsers on a standard benchmark and on a set of data relevant to measuring text cohesion. We outline the advantages and disadvantages of existing technologies and make recommendations. Our performance report uses traditional measures based on a gold standard as well as novel dimensions for parsing evaluation. To our knowledge, this is the first attempt to evaluate parsers across genres and grade levels for implementation in learning technology.
