Corpus Annotation for Parser Evaluation

We describe a recently developed corpus annotation scheme for evaluating parsers that avoids shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.
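
To illustrate how a corpus of head-dependent relations might be used to score a parser, the following minimal sketch compares a parser's relations against gold-standard annotation. The names `GR` and `score_relations`, the relation labels, and the choice of precision, recall and F1 as measures are all assumptions made for illustration; they are not taken from the annotation scheme or the evaluation procedure described here.

```python
from typing import NamedTuple, Set, Tuple


class GR(NamedTuple):
    """One grammatical relation: a labelled link from a head to a dependent.

    The labels used below ("subj", "obj", "det") are illustrative only,
    not the relation inventory of the annotation scheme.
    """
    relation: str
    head: str
    dependent: str


def score_relations(gold: Set[GR], parsed: Set[GR]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of parsed relations against gold ones.

    Assumption: a parsed relation counts as correct only if its label,
    head and dependent all match a gold relation exactly.
    """
    correct = len(gold & parsed)
    precision = correct / len(parsed) if parsed else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotation for "the dog chased the cat".
gold = {
    GR("subj", "chased", "dog"),
    GR("obj", "chased", "cat"),
    GR("det", "dog", "the"),
    GR("det", "cat", "the"),
}

# Hypothetical parser output with one determiner attached to the wrong head.
parsed = {
    GR("subj", "chased", "dog"),
    GR("obj", "chased", "cat"),
    GR("det", "dog", "the"),
    GR("det", "chased", "the"),
}

precision, recall, f1 = score_relations(gold, parsed)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Representing each sentence as a set of relations means a single misattachment costs one relation rather than invalidating a whole bracketing, which is one reason a relation-based comparison can be attractive for a parser that returns partial analyses.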
