Anchoring and Agreement in Syntactic Annotations

We present a study of two key characteristics of human syntactic annotations: anchoring and agreement. Anchoring is a well-known cognitive bias in human decision making, where judgments are drawn towards pre-existing values. We study the influence of anchoring on a standard approach to the creation of syntactic resources, in which syntactic annotations are obtained via human editing of tagger and parser output. Our experiments demonstrate a clear anchoring effect and reveal unwanted consequences, including overestimation of parsing performance and lower annotation quality in comparison with human-based annotations. Using sentences from the Penn Treebank WSJ, we also report systematically obtained inter-annotator agreement estimates for English dependency parsing. Our agreement results control for parser bias, and are consequential in that they are on par with state-of-the-art parsing performance for English newswire. We discuss the impact of our findings on strategies for future annotation efforts and parser evaluations.
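To make the notion of inter-annotator agreement for dependency parsing concrete, the following is a minimal sketch of one common way to score it: the fraction of tokens on which two annotators assign the same head (unlabeled) and the same head and relation label (labeled). The abstract does not state which agreement measure is used, so the function name `attachment_agreement`, the token representation, and the toy annotations are all illustrative assumptions rather than the authors' procedure.

```python
from typing import List, Tuple

# Hypothetical representation: one (head index, relation label) pair per token,
# with head index 0 standing for the artificial root.
Token = Tuple[int, str]


def attachment_agreement(a: List[Token], b: List[Token]) -> Tuple[float, float]:
    """Return (unlabeled, labeled) agreement between two annotations
    of the same sentence, as fractions of tokens."""
    if len(a) != len(b):
        raise ValueError("annotations must cover the same tokens")
    if not a:
        raise ValueError("annotations must not be empty")
    n = len(a)
    # Unlabeled agreement: both annotators chose the same head.
    same_head = sum(1 for (head_a, _), (head_b, _) in zip(a, b) if head_a == head_b)
    # Labeled agreement: same head and same relation label.
    same_both = sum(1 for tok_a, tok_b in zip(a, b) if tok_a == tok_b)
    return same_head / n, same_both / n


# Toy four-token sentence annotated twice (invented data, for illustration only).
annotator_1 = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "dobj")]
annotator_2 = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "iobj")]

uas, las = attachment_agreement(annotator_1, annotator_2)
print(f"unlabeled agreement: {uas:.2f}, labeled agreement: {las:.2f}")
```

Scoring a parser against a gold standard with the same function yields figures on the same scale, which is what allows agreement between annotators to be compared directly with parsing performance.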
