Evaluation of text coherence for electronic essay scoring systems

Existing software systems for automated essay scoring can provide NLP researchers with opportunities to test certain theoretical hypotheses, including some derived from Centering Theory. In this study we employ the Educational Testing Service's e-rater essay scoring system to examine whether local discourse coherence, as defined by a measure of Centering Theory's Rough-Shift transitions, might be a significant contributor to the evaluation of essays. Rough-Shifts within students' paragraphs often occur when topics are short-lived and unconnected, and are therefore indicative of poor topic development. We show that adding the Rough-Shift-based metric to the system improves its performance significantly, better approximating human scores and making it possible to give students valuable instructional feedback. These results indicate that Rough-Shifts do indeed capture a source of incoherence, one that has not been closely examined in the Centering literature. They not only justify Rough-Shifts as a valid transition type but also support the original formulation of Centering as a measure of discourse continuity even in pronominal-free text. Finally, our study design, which used a combination of automated and manual NLP techniques, highlights specific areas of NLP research and development needed for engineering practical applications.
