Scaling Up Writing in the Curriculum: Batch Mode Active Learning for Automated Essay Scoring

Automated essay scoring (AES) allows writing to be assigned in large courses and can provide instant formative feedback to students. However, creating models for AES can be costly, requiring the collection and human scoring of hundreds of essays. We have developed and are piloting a web-based tool that lets instructors score responses incrementally, enabling AES while minimizing the number of essays the instructors must score. Previous work has shown that techniques from active learning, a subfield of machine learning, can reduce the amount of training data required to create effective AES models. We extend those results to a less idealized scenario: one driven by the instructor's need to score sets of essays, in which the model is trained iteratively using batch mode active learning. We propose a novel approach, which we refer to as topological maxima, that is inspired by a class of topological methods but has reduced computational requirements. Using actual student data, we show that batch mode active learning is a practical approach to training AES models. Finally, we discuss implications of using this technology for automated customized scoring of writing across the curriculum.
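
To make the batch-mode setting concrete, the sketch below shows one simple way a tool might pick the next set of essays for an instructor to score. It is not the topological maxima method proposed here; it swaps in a plain greedy farthest-first (diversity) heuristic purely for illustration. The function name `select_batch`, the representation of essays as numeric feature tuples, and the use of Euclidean distance are all assumptions.

```python
import math


def select_batch(pool, labeled, batch_size):
    """Greedy farthest-first batch selection (hypothetical helper).

    pool:       feature vectors of essays not yet scored
    labeled:    feature vectors of essays the instructor already scored
    batch_size: number of essays to hand to the instructor next
    Returns indices into `pool`, in the order they were chosen.
    """
    # Distance from each pool essay to its nearest already-scored essay
    # (infinity when nothing has been scored yet).
    min_dist = [
        min((math.dist(x, y) for y in labeled), default=math.inf)
        for x in pool
    ]
    chosen = []
    for _ in range(min(batch_size, len(pool))):
        # Pick the unchosen essay farthest from everything scored or chosen.
        idx = max(
            (i for i in range(len(pool)) if i not in chosen),
            key=lambda i: min_dist[i],
        )
        chosen.append(idx)
        # The chosen essay now counts as covered, so shrink the distances.
        for i, x in enumerate(pool):
            min_dist[i] = min(min_dist[i], math.dist(x, pool[idx]))
    return chosen


# Assumed toy data: four essays in a 2-D feature space, one already scored.
pool = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch = select_batch(pool, labeled=[(0.0, 0.0)], batch_size=2)
```

In an iterative workflow, the instructor would score the returned batch, the model would be retrained on all scored essays, and the selection step would run again on the remaining pool.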
