Eyes are the Windows to the Soul: Predicting the Rating of Text Quality Using Gaze Behaviour

Predicting a reader’s rating of text quality is a challenging task that involves estimating subjective aspects of the text, such as structure and clarity. Such subjective aspects are better handled using cognitive information, and one source of cognitive information is gaze behaviour. In this paper, we show that gaze behaviour does indeed help in predicting the rating of text quality. To do this, we first model text quality as a function of three properties: organization, coherence and cohesion. We then demonstrate how capturing gaze behaviour helps in predicting each of these properties, and hence the overall quality, by reporting the improvements obtained when gaze features are added to traditional textual features for score prediction. We also hypothesize that if a reader has fully understood the text, the corresponding gaze behaviour gives a better indication of the assigned rating than it does when the reader has only partially understood it. Our experiments validate this hypothesis by showing greater agreement between the given rating and the predicted rating when the reader has a full understanding of the text.
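The abstract does not say which measure is used for agreement between the given and predicted ratings. As an illustration only, the sketch below assumes a quadratically weighted kappa, a common choice for ordinal ratings; the function name, its parameters and the sample ratings are hypothetical and do not come from the paper.

```python
from collections import Counter


def quadratic_weighted_kappa(given, predicted, min_rating, max_rating):
    """Agreement between two lists of integer ratings (hypothetical helper).

    Returns 1.0 for perfect agreement, about 0.0 for chance-level agreement,
    and negative values for systematic disagreement.
    """
    if len(given) != len(predicted) or not given:
        raise ValueError("rating lists must be non-empty and of equal length")
    if max_rating <= min_rating:
        raise ValueError("max_rating must be greater than min_rating")

    n = len(given)
    span = max_rating - min_rating          # largest possible disagreement
    observed = Counter(zip(given, predicted))
    given_hist = Counter(given)
    predicted_hist = Counter(predicted)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            # Quadratic penalty: larger rating gaps cost disproportionately more.
            weight = ((i - j) ** 2) / (span ** 2)
            # Expected count if the two ratings were statistically independent.
            expected = given_hist[i] * predicted_hist[j] / n
            weighted_observed += weight * observed[(i, j)]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both lists contain a single identical rating; treat as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Made-up ratings on an assumed 1-4 scale, for illustration only.
reader_ratings = [1, 2, 3, 4]
model_ratings = [1, 2, 3, 4]
kappa = quadratic_weighted_kappa(reader_ratings, model_ratings, 1, 4)
```

Under this assumption, the hypothesis in the abstract would correspond to a higher kappa on the subset of readers who fully understood the text than on the subset who understood it only partially.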
