Parser evaluation using textual entailments

Parser Evaluation using Textual Entailments (PETE) is a shared task in the SemEval-2010 Evaluation Exercises on Semantic Evaluation. The task involves recognizing textual entailments based on syntactic information alone. PETE introduces a new parser evaluation scheme that is formalism-independent, less prone to annotation error, and focused on semantically relevant distinctions. This paper describes the PETE task, gives an error analysis of the top-performing Cambridge system, and introduces a standard entailment module that can be used with any parser that outputs Stanford typed dependencies.
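As a concrete illustration of the kind of decision such an entailment module makes, the sketch below checks whether every Stanford typed dependency of the hypothesis also appears among the dependencies of the text. This is a minimal sketch under assumed simplifications (exact relation match, lowercased word forms, no special treatment of auxiliaries or negation); the function names and the matching heuristic are illustrative assumptions, not the authors' exact module.

```python
# Illustrative sketch of a dependency-containment entailment check.
# Dependencies are (relation, head, dependent) triples, e.g. ("nsubj", "slept", "John"),
# as produced by any parser that outputs Stanford typed dependencies.

def normalize(word):
    """Lowercase a token so surface-form differences do not block a match."""
    return word.lower()


def entails(text_deps, hyp_deps):
    """Return True if every hypothesis dependency is covered by the text.

    The matching criterion (identical relation plus lowercased head and
    dependent) is a simplifying assumption for illustration only.
    """
    text_set = {(rel, normalize(h), normalize(d)) for rel, h, d in text_deps}
    return all((rel, normalize(h), normalize(d)) in text_set
               for rel, h, d in hyp_deps)


if __name__ == "__main__":
    # Text: "John, who was tired, slept."   Hypothesis: "John slept."
    text_deps = [("nsubj", "slept", "John"),
                 ("acl:relcl", "John", "tired"),
                 ("nsubj", "tired", "who"),
                 ("cop", "tired", "was")]
    hyp_deps = [("nsubj", "slept", "John")]
    print(entails(text_deps, hyp_deps))  # True: the hypothesis dependency is covered
```

In practice a module of this kind would also need to handle relation-label differences across parsers and lexical variation between text and hypothesis, which is precisely where parser-specific errors become visible.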
