Introducing EVALD - Software Applications for Automatic Evaluation of Discourse in Czech

In this paper, we introduce two software applications for the automatic evaluation of coherence in Czech texts, both called EVALD (Evaluator of Discourse). The first, EVALD 1.0, evaluates texts written by native speakers of Czech on the five-step scale commonly used in Czech schools (grade 1 is the best, grade 5 the worst). The second, EVALD 1.0 for Foreigners, assesses texts written by non-native speakers of Czech on the six-step scale (A1–C2) of the Common European Framework of Reference for Languages (CEFR). Both applications are available online at https://lindat.mff.cuni.cz/services/evald-foreign/.
