An Empirical Study of Differences between Conversion Schemes and Annotation Guidelines

We establish quantitative methods for comparing and estimating the quality of dependency annotations or conversion schemes. We use generalized tree-edit distance to measure divergence between annotations, and we propose theoretical learnability, derivational perplexity, and downstream performance as evaluation criteria. We present systematic experiments with tree-to-dependency conversions of the Penn-III treebank, as well as observations from experiments using treebanks from multiple languages. Our most important observations are: (a) parser bias makes most parsers insensitive to non-local differences between annotations, but (b) the choice of annotation nevertheless has a significant impact on most downstream applications, and (c) while learnability does not correlate with downstream performance, learnable annotations lead to more robust performance across domains.
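To make the idea of measuring divergence between two annotations of the same sentence concrete, here is a minimal sketch. It does not implement the generalized tree-edit distance used in the study; it substitutes a much simpler proxy, the fraction of tokens whose head differs between two schemes. The function name `head_disagreement` and the head-index encoding (1-based head positions, 0 for the root) are assumptions made for illustration only.

```python
from typing import Sequence


def head_disagreement(heads_a: Sequence[int], heads_b: Sequence[int]) -> float:
    """Fraction of tokens assigned different heads by two annotations.

    Each sequence gives, for token i (1-based), the position of its head,
    with 0 marking the root. This is a simplified stand-in for a tree-edit
    distance, not the measure used in the study.
    """
    if len(heads_a) != len(heads_b):
        raise ValueError("annotations must cover the same tokens")
    if not heads_a:
        return 0.0
    differing = sum(1 for a, b in zip(heads_a, heads_b) if a != b)
    return differing / len(heads_a)


# Hypothetical example: "She has eaten" under two conventions.
aux_headed = [2, 0, 2]   # auxiliary "has" is the root
verb_headed = [3, 3, 0]  # main verb "eaten" is the root

print(head_disagreement(aux_headed, verb_headed))
```

A per-token proxy like this only captures local attachment differences; an edit-distance formulation additionally accounts for how subtrees must be restructured to turn one analysis into the other.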
