Automatic Grading and Feedback using Program Repair for Introductory Programming Courses

We present GradeIT, a system that combines automated grading and program repair for introductory programming courses (CS1). Syntax errors pose a significant challenge for testcase-based grading, because it is difficult to distinguish a submission that is almost correct but contains minor syntax errors from one that is completely off the mark. GradeIT uses program repair to grade submissions that do not compile: repairing minor syntax errors makes it possible to run testcases on these submissions and award them partial marks, whereas without repair they fail to compile and therefore pass no testcase. Our experiments on 15613 submissions show that GradeIT's results are comparable to manual grading by teaching assistants (TAs) and do not suffer from the unintentional variability that arises when multiple TAs grade the same assignment. The repairs performed by GradeIT enabled successful compilation of 56% of the submissions that had compilation errors and improved the marks of 11% of these submissions.
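
The repair-then-grade idea described above can be illustrated with a minimal sketch. This is not GradeIT's implementation: Python's built-in `compile()` stands in for the compiler, `toy_repair` is a hypothetical one-rule repairer (it only adds a missing trailing colon), the entry-point name `solve`, the `max_marks` scale, and the optional `repair_penalty` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Test = Tuple[tuple, object]  # (arguments, expected result)

def compiles(source: str) -> bool:
    """Stand-in for the compile step: True if the source parses."""
    try:
        compile(source, "<submission>", "exec")
        return True
    except SyntaxError:
        return False

def toy_repair(source: str) -> Optional[str]:
    """Hypothetical repair: append a missing ':' to block-opening lines.
    Returns the repaired source, or None if it still does not compile."""
    openers = {"def", "if", "elif", "else", "for", "while"}
    fixed = []
    for line in source.splitlines():
        line = line.rstrip()
        first = line.lstrip().split(" ", 1)[0] if line.strip() else ""
        if first in openers and not line.endswith(":"):
            line += ":"
        fixed.append(line)
    candidate = "\n".join(fixed)
    return candidate if compiles(candidate) else None

def run_tests(source: str, tests: List[Test]) -> int:
    """Count testcases passed by the submission's `solve` function."""
    namespace: dict = {}
    try:
        exec(source, namespace)
    except Exception:
        return 0
    func = namespace.get("solve")
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing testcase simply earns no credit
    return passed

def grade(source: str, tests: List[Test], max_marks: float = 10.0,
          repair_penalty: float = 0.0) -> Tuple[float, bool]:
    """Return (marks, was_repaired). Repair is attempted only on
    submissions that fail to compile; unrepairable ones score zero."""
    repaired = False
    if not compiles(source):
        fixed = toy_repair(source)
        if fixed is None:
            return 0.0, False
        source, repaired = fixed, True
    marks = max_marks * run_tests(source, tests) / len(tests)
    if repaired:
        marks = max(0.0, marks - repair_penalty)  # assumed policy knob
    return marks, repaired
```

For example, a submission `"def solve(a, b)\n    return a + b"` fails to compile because of the missing colon and would score zero under plain testcase-based grading; after the toy repair it compiles and earns marks in proportion to the testcases it passes.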
