Unsupervised Modeling of Topical Relevance in L2 Learner Text

The automated scoring of second-language (L2) learner text along various writing dimensions is an increasingly active research area. In this paper, we focus on determining the topical relevance of an essay to the prompt that elicited it. Because manually assigning the scores needed to train supervised prompt-relevance models is burdensome, we develop unsupervised models and show that their output correlates well with human judgements. We show that expanding prompts with topically related words, via pseudo-relevance modelling, is beneficial and outperforms other distributional techniques. Finally, we incorporate our prompt-relevance models into a supervised essay-scoring system that predicts a holistic score, and show that doing so improves the system's performance.
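To make the idea of prompt expansion concrete, the following is a minimal sketch and not the models evaluated in the paper. It assumes a simple bag-of-words representation, cosine similarity as the relevance score, and a crude feedback step that adds the most frequent new terms from the essays closest to the prompt. The function names, the stop-word list, and the parameters `top_docs` and `top_terms` are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not from the paper).
STOP = {"a", "the", "my", "was", "with", "on", "we", "to", "your"}

def bag(text):
    """Lower-case bag-of-words vector with stop words removed."""
    return Counter(w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def expand_prompt(prompt, essays, top_docs=2, top_terms=3):
    """Add frequent terms from the essays most similar to the prompt."""
    p = bag(prompt)
    vecs = [bag(e) for e in essays]
    # Treat the top-ranked essays as (pseudo-)relevant feedback documents.
    feedback_docs = sorted(vecs, key=lambda v: cosine(p, v), reverse=True)[:top_docs]
    feedback = Counter()
    for v in feedback_docs:
        feedback.update(v)
    new_terms = [t for t, _ in feedback.most_common() if t not in p][:top_terms]
    expanded = Counter(p)
    for t in new_terms:
        expanded[t] += 1  # unit weight per expansion term (a simplification)
    return expanded

def relevance(prompt, essay, essays):
    """Score an essay against the expanded prompt."""
    return cosine(expand_prompt(prompt, essays), bag(essay))

prompt = "describe your favourite holiday"
essays = [
    "my favourite holiday was a beach trip with beach games",
    "on holiday we took a trip to the beach",
    "the stock market fell sharply",
]
on_topic = "we enjoyed the beach trip"
off_topic = "the stock market fell sharply"
print(cosine(bag(prompt), bag(on_topic)))    # no word overlap with the raw prompt
print(relevance(prompt, on_topic, essays))   # overlap appears after expansion
print(relevance(prompt, off_topic, essays))
```

In this toy setting, the on-topic essay shares no words with the raw prompt, so an unexpanded comparison cannot distinguish it from the off-topic one; expansion with terms drawn from similar essays is what lets the score separate them. A real system would need a principled weighting of the expansion terms and a much larger background collection.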
