Towards Framework-Independent Evaluation of Deep Linguistic Parsers

This paper describes practical issues in the framework-independent evaluation of deep and shallow parsers. We focus on the use of two dependency-based syntactic representation formats in parser evaluation, namely the Grammatical Relations of Carroll et al. (1998) and the Stanford Dependency scheme of de Marneffe et al. (2006). Our approach is to convert the output of parsers into these two formats and measure the accuracy of the converted output. Through the evaluation of an HPSG parser and Penn Treebank phrase structure parsers, we found that mapping between representation schemes is a non-trivial task, and that the resulting conversions are lossy and may obscure important differences between parsing approaches. We discuss sources of disagreement in how the two dependency-based formats represent syntactic structures, and indicate possible directions for improved framework-independent parser evaluation.
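As a rough illustration of what measuring the accuracy of converted output can look like, the sketch below scores a set of predicted dependencies against a gold set using precision, recall, and F1 over exact (relation, head, dependent) matches. This is a minimal sketch under stated assumptions, not the evaluation procedure used in the paper: the `Dependency` type, the `score` function, the relation labels, and the example sentence are all hypothetical, and exact triple matching is only one of several possible matching criteria.

```python
from typing import Dict, NamedTuple, Set


class Dependency(NamedTuple):
    """One labelled dependency; the field layout is an assumption for this sketch."""
    relation: str   # illustrative labels such as "nsubj" or "det"
    head: str
    dependent: str


def score(gold: Set[Dependency], predicted: Set[Dependency]) -> Dict[str, float]:
    """Precision, recall, and F1 over exactly matching dependency triples."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold annotation for "the dog barked loudly".
gold = {
    Dependency("nsubj", "barked", "dog"),
    Dependency("det", "dog", "the"),
    Dependency("advmod", "barked", "loudly"),
}

# Hypothetical converted parser output with one mislabelled relation.
predicted = {
    Dependency("nsubj", "barked", "dog"),
    Dependency("det", "dog", "the"),
    Dependency("dobj", "barked", "loudly"),
}

result = score(gold, predicted)
```

Under exact matching, a single label mismatch introduced by the conversion step counts as both a precision error and a recall error, even if the parser's original analysis was correct in its own framework. This is one way a lossy conversion can distort a comparison between parsers.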

[1]  Jari Björne,et al.  BioInfer: a corpus for information extraction in the biomedical domain , 2007, BMC Bioinformatics.

[2]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[3]  Ted Briscoe,et al.  An introduction to tag sequence grammars and the RASP system parser , 2006 .

[4]  Jun'ichi Tsujii,et al.  HPSG Parsing with Shallow Dependency Constraints , 2007, ACL.

[5]  James R. Curran,et al.  Formalism-Independent Parser Evaluation with CCG and DepBank , 2007, ACL.

[6]  James R. Curran,et al.  Parsing the WSJ Using CCG and Log-Linear Models , 2004, ACL.

[7]  Ted Briscoe,et al.  Evaluating the Accuracy of an Unlexicalized Statistical Parser on the PARC DepBank , 2006, ACL.

[8]  Tapio Salakoski,et al.  On the unification of syntactic annotations under the Stanford dependency scheme: A case study on BioInfer and GENIA , 2007, BioNLP@ACL.

[9]  Ralph Grishman,et al.  A Procedure for Quantitatively Comparing the Syntactic Coverage of English Grammars , 1991, HLT.

[10]  Jun'ichi Tsujii,et al.  Probabilistic Disambiguation Models for Wide-Coverage HPSG Parsing , 2005, ACL.

[11]  Mark Steedman,et al.  Acquiring Compact Lexicalized Grammars from a Cleaner Treebank , 2002, LREC.

[12]  Christopher D. Manning,et al.  The Leaf Projection Path View of Parse Trees: Exploring String Kernels for HPSG Parse Selection , 2004 .

[13]  Andy Way,et al.  Evaluation of an automatic f-structure annotation algorithm against the PARC 700 dependency bank , 2004 .

[14]  Jun'ichi Tsujii,et al.  A log-linear model with an n-gram reference distribution for accurate HPSG parsing , 2007, IWPT.

[15]  Christopher D. Manning,et al.  Generating Typed Dependency Parses from Phrase Structure Parses , 2006, LREC.

[16]  Michael Collins,et al.  Three Generative, Lexicalised Models for Statistical Parsing , 1997, ACL.

[17]  Robert Malouf,et al.  Wide Coverage Parsing with Stochastic Attribute Value Grammars , 2004 .

[18]  Jun'ichi Tsujii,et al.  Corpus-Oriented Grammar Development for Acquiring a Head-Driven Phrase Structure Grammar from the Penn Treebank , 2004, IJCNLP.

[19]  Eugene Charniak,et al.  Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking , 2005, ACL.

[20]  Ted Briscoe,et al.  High Precision Extraction of Grammatical Relations , 2001, COLING.

[21]  Jun'ichi Tsujii,et al.  GENIA corpus - a semantically annotated corpus for bio-textmining , 2003, ISMB.

[22]  Ted Briscoe,et al.  Parser evaluation: a survey and a new proposal , 1998, LREC.

[23]  Andy Way,et al.  Treebank-based acquisition of wide-coverage, probabilistic LFG resources: project overview, results and evaluation , 2004 .

[24]  Dan Klein,et al.  Accurate Unlexicalized Parsing , 2003, ACL.

[25]  Stephan Oepen,et al.  Towards holistic grammar engineering and testing: grafting treebank maintenance into the grammar revision cycle , 2004 .

[26]  Eugene Charniak,et al.  A Maximum-Entropy-Inspired Parser , 2000, ANLP.

[27]  Jun'ichi Tsujii,et al.  Efficient HPSG Parsing with Supertagging and CFG-filtering , 2006 .

[28]  Ted Briscoe,et al.  The Second Release of the RASP System , 2006, ACL.

[29]  Julia Hockenmaier.  Parsing with Generative Models of Predicate-Argument Structure , 2003, ACL.

[30]  Mary Dalrymple,et al.  The PARC 700 Dependency Bank , 2003, LINC@EACL.

[31]  Judita Preiss.  Using Grammatical Relations to Compare Parsers , 2003, EACL.

[32]  Stefan Riezler,et al.  Speed and Accuracy in Shallow and Deep Stochastic Parsing , 2004, NAACL.