Estimating Annotation Complexities of Text Using Gaze and Textual Information

Supervised data-driven methods for NLP tasks such as part-of-speech tagging, dependency parsing, and machine translation fundamentally require large-scale annotated data. As statistical methods have displaced rule-based and heuristic methods over the years, text annotation has become an essential part of NLP research. Annotation refers to the task of manually labeling text, images, or other data with comments, explanations, tags, or markup; in NLP, it is often carried out by linguists who label raw text. While the outcome of the annotation process, i.e., the labeled data, is valuable, capturing user activities during annotation may also help in understanding the cognitive subprocesses underlying text annotation.
