Autograding "Explain in Plain English" questions using NLP

Previous research suggests that "Explain in Plain English" (EiPE) code-reading activities could play an important role in the development of novice programmers, but EiPE questions are not widely used in introductory programming courses because they have traditionally required manual grading. We present what we believe to be the first automatic grader for EiPE questions and describe its deployment in a large-enrollment introductory programming course. Based on a set of questions deployed on a computer-based exam, we find that our implementation has an accuracy of 87-89%, which is comparable to that of course teaching assistants trained to perform this task and compares favorably to automatic short answer grading algorithms developed for other domains. In addition, we briefly characterize the kinds of answers that the current autograder fails to score correctly and the kinds of errors made by students.
