Incorporating Coreference to Automatic Evaluation of Coherence in Essays

The paper contributes to the task of automated evaluation of surface coherence. It introduces a coreference-related extension to the EVALD applications, which evaluate essays written by native and non-native students learning Czech. By employing a coreference resolver and coreference-related features, our system outperforms the original EVALD approaches by up to 8 percentage points. The paper also introduces a dataset for evaluating essays by non-native speakers. The dataset was collected from multiple corpora, and the parts that lacked a coherence grade annotation were judged manually. The resulting corpus contains a sufficient number of examples for each of the grading levels.
