Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output

Constituency parser performance is primarily interpreted through a single metric, F-score on WSJ section 23, that conveys no linguistic information regarding the remaining errors. We classify errors within a set of linguistically meaningful types using tree transformations that repair groups of errors together. We use this analysis to answer a range of questions about parser behaviour, including what linguistic constructions are difficult for state-of-the-art parsers, what types of errors are being resolved by rerankers, and what types are introduced when parsing out-of-domain text.
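As a rough illustration of why a single F-score conveys little linguistic information, the sketch below computes labeled bracket precision, recall, and F1 for one gold/predicted tree pair. It is a simplified stand-in for standard bracket scoring, not the authors' tooling: the function names (`parse_brackets`, `labeled_spans`, `bracket_f1`) and the example sentence are assumptions made for illustration, and details of standard evaluation such as punctuation handling are omitted.

```python
from collections import Counter


def parse_brackets(text):
    """Parse a Penn-Treebank-style bracketed string into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def labeled_spans(tree):
    """Count (label, start, end) spans for every node above the POS-tag level."""
    spans = Counter()

    def walk(node, start):
        label, children = node
        if len(children) == 1 and isinstance(children[0], str):
            return start + 1  # preterminal: covers one word and is not scored
        end = start
        for child in children:
            end = end + 1 if isinstance(child, str) else walk(child, end)
        spans[(label, start, end)] += 1
        return end

    walk(tree, 0)
    return spans


def bracket_f1(gold, predicted):
    """Return (precision, recall, F1) over labeled brackets for one sentence."""
    g = labeled_spans(parse_brackets(gold))
    p = labeled_spans(parse_brackets(predicted))
    matched = sum((g & p).values())  # multiset intersection
    precision = matched / sum(p.values()) if p else 0.0
    recall = matched / sum(g.values()) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction attaches the PP to the noun phrase
# instead of the verb phrase, adding one spurious NP bracket.
GOLD = "(S (NP (PRP I)) (VP (VBD saw) (NP (DT the) (NN man)) (PP (IN with) (NP (DT a) (NN telescope)))))"
PRED = "(S (NP (PRP I)) (VP (VBD saw) (NP (NP (DT the) (NN man)) (PP (IN with) (NP (DT a) (NN telescope))))))"

if __name__ == "__main__":
    print(bracket_f1(GOLD, PRED))
```

In this example the score drops only because of one extra NP bracket; nothing in the number itself indicates that the underlying mistake is a PP attachment error, which is the kind of information an error-type classification is meant to recover.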

Dan Klein | James R. Curran | David Hall | Jonathan K. Kummerfeld

[1] Emily M. Bender, et al. Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus, 2011, EMNLP.

[2] Joakim Nivre, et al. Characterizing the Errors of Data-Driven Dependency Parsing Models, 2007, EMNLP.

[3] Jun'ichi Tsujii, et al. Descriptive and Empirical Approaches to Capturing Underlying Dependencies among Parsing Errors, 2009, EMNLP.

[4] Michael Collins, et al. Discriminative Reranking for Natural Language Parsing, 2000, CL.

[5] Stephan Oepen, et al. Parser Evaluation Using Elementary Dependency Matching, 2011, IWPT.

[6] Eugene Charniak, et al. Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking, 2005, ACL.

[7] Penn Treebank. Linguistic Data Consortium, 1999.

[8] Daniel Gildea, et al. Corpus Variation and Parser Performance, 2001, EMNLP.

[9] James R. Curran, et al. Reranking a Wide-Coverage CCG Parser, 2010, ALTA.

[10] Chris Quirk, et al. The Impact of Parse Quality on Syntactically-Informed Statistical Machine Translation, 2006, EMNLP.

[11] Dan Klein, et al. Accurate Unlexicalized Parsing, 2003, ACL.

[12] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[13] Andrew B. Clegg, et al. Evaluating and Integrating Treebank Parsers on a Biomedical Corpus, 2005, ACL 2005.

[14] Jun'ichi Tsujii, et al. Task-oriented Evaluation of Syntactic Parsers and Their Representations, 2008, ACL.

[15] Eugene Charniak, et al. Effective Self-Training for Parsing, 2006, NAACL.

[16] Ted Briscoe, et al. Relational Evaluation Schemes, 2002.

[17] Liang Huang, et al. Forest Reranking: Discriminative Parsing with Non-Local Features, 2008, ACL.

[18] Joakim Nivre, et al. Evaluation of Dependency Parsers on Unbounded Dependencies, 2010, COLING.

[19] Daniel Jurafsky, et al. Parsing to Stanford Dependencies: Trade-offs between Speed and Accuracy, 2010, LREC.

[20] Ted Briscoe, et al. Evaluating the Accuracy of an Unlexicalized Statistical Parser on the PARC DepBank, 2006, ACL.

[21] Jun'ichi Tsujii, et al. Evaluating Impact of Re-training a Lexical Disambiguation Model on Domain Adaptation of an HPSG Parser, 2007, Trends in Parsing Technology.

[22] Brian Roark, et al. Beam-Width Prediction for Efficient Context-Free Parsing, 2011, ACL.

[23] Adam Lopez, et al. A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing, 2011, ACL.

[24] Michael Collins, et al. Head-Driven Statistical Models for Natural Language Parsing, 2003, CL.

[25] Ralph Grishman, et al. A Procedure for Quantitatively Comparing the Syntactic Coverage of English Grammars, 1991, HLT.

[26] James Henderson. Inducing History Representations for Broad Coverage Statistical Parsing, 2003, HLT-NAACL.

[27] Dan Klein, et al. Improved Inference for Unlexicalized Parsing, 2007, NAACL.

[28] Stephen Clark, et al. Evaluating a Wide-Coverage CCG Parser, 2013.

[29] Dan Klein, et al. Learning Accurate, Compact, and Interpretable Tree Annotation, 2006, ACL.

[30] Deniz Yuret, et al. SemEval-2010 Task 12: Parser Evaluation Using Textual Entailments, 2010, SemEval.

[31] James Henderson, et al. Discriminative Training of a Neural Network Statistical Parser, 2004, ACL.

[32] Ted Briscoe, et al. Parser Evaluation: A Survey and a New Proposal, 1998, LREC.

[33] Brian Roark, et al. Efficient Matrix-Encoded Grammars and Low Latency Parallelization Strategies for CYK, 2011, IWPT.

[34] Michael Collins, et al. Three Generative, Lexicalised Models for Statistical Parsing, 1997, ACL.

[35] Daniel M. Bikel, et al. Intricacies of Collins’ Parsing Model, 2004, CL.

[36] Josef van Genabith, et al. Parser Evaluation and the BNC: Evaluating 4 Constituency Parsers with 3 Metrics, 2008, LREC.

[37] Eugene Charniak, et al. A Maximum-Entropy-Inspired Parser, 2000, ANLP.

[38] Dekang Lin, et al. A Dependency-Based Method for Evaluating Broad-Coverage Parsers, 1995, Natural Language Engineering.

[39] Mary Dalrymple, et al. The PARC 700 Dependency Bank, 2003, LINC@EACL.

[40] Kun Yu, et al. Analysis of the Difficulties in Chinese Deep Parsing, 2011, IWPT.

[41] Dan Klein, et al. Fast Exact Inference with a Factored Model for Natural Language Parsing, 2002, NIPS.