A Fluctuation Smoothing Approach for Unsupervised Automatic Short Answer Grading

We propose a fluctuation smoothing approach for unsupervised automatic short answer grading (ASAG) techniques in education. A major drawback of existing techniques is that variations in the model answer can significantly affect their performance. The proposed approach, based on classical sequential pattern mining, exploits the lexical overlap among students’ answers to a given question. Using multiple datasets, we demonstrate empirically that it improves the overall performance of unsupervised ASAG techniques and significantly reduces the variation in their performance (measured as standard deviation), by up to 63%. We also introduce additional benchmarks, namely (a) paraphrasing of model answers and (b) using the answers of the k top-performing students as model answers, to further demonstrate the benefits of the proposed approach.
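To make the idea of mining lexical overlap from students’ answers concrete, the following is a minimal sketch and not the authors' implementation. It assumes a brute-force enumeration of ordered word subsequences (gaps allowed) and keeps those that occur in at least a given fraction of the answers. The function names `tokenize` and `frequent_sequences`, the parameters `min_support` and `max_len`, and the sample answers are all hypothetical and chosen only for illustration. A classical sequential pattern mining algorithm would prune candidates far more efficiently.

```python
from collections import Counter
from itertools import combinations


def tokenize(answer):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?()\"'") for word in answer.lower().split())
    return [t for t in tokens if t]


def frequent_sequences(answers, min_support=0.5, max_len=3):
    """Return word subsequences found in at least min_support of the answers.

    Hypothetical helper: word order is preserved and gaps are allowed.
    The result maps each pattern (a tuple of words) to the number of
    answers that contain it.
    """
    support = Counter()
    for answer in answers:
        tokens = tokenize(answer)
        seen = set()
        for n in range(1, max_len + 1):
            # combinations() keeps the original word order within each tuple
            seen.update(combinations(tokens, n))
        support.update(seen)  # each pattern counts once per answer
    threshold = min_support * len(answers)
    return {p: c for p, c in support.items() if c >= threshold}


# Made-up student answers, for illustration only
answers = [
    "A stack is a last in first out structure.",
    "Stack follows last in first out order.",
    "It is a queue.",
]
patterns = frequent_sequences(answers, min_support=0.6, max_len=3)
```

In this sketch, patterns shared by most students, such as the word sequence "last in first out", surface with high support even if the instructor's model answer is worded differently. Enumerating all subsequences grows quickly with answer length, so `max_len` is kept small here.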
