Search, align, and repair: data-driven feedback generation for introductory programming exercises

This paper introduces the “Search, Align, and Repair” data-driven program repair framework to automate feedback generation for introductory programming exercises. Distinct from existing techniques, our goal is to develop an efficient, fully automated, and problem-agnostic technique for large or MOOC-scale introductory programming courses. We leverage the large amount of available student submissions in such settings and develop new algorithms for identifying similar programs, aligning correct and incorrect programs, and repairing incorrect programs by finding minimal fixes. We have implemented our technique in the Sarfgen system and evaluated it on thousands of real student attempts from the Microsoft-DEV204.1x edX course and the Microsoft CodeHunt platform. Our results show that Sarfgen can, within two seconds on average, generate concise, useful feedback for 89.7% of the incorrect student submissions. It has been integrated with the Microsoft-DEV204.1X edX class and deployed for production use.

[1]  Roderick Bloem,et al.  Program Repair as a Game , 2005, CAV.

[2]  Sarfraz Khurshid,et al.  Specification-Based Program Repair Using SAT , 2011, TACAS.

[3]  Swarat Chaudhuri,et al.  Learning to Grade Student Programs in a Massive Open Online Course , 2014, 2014 IEEE International Conference on Data Mining.

[4]  Anna Philippou,et al.  Tools and Algorithms for the Construction and Analysis of Systems , 2018, Lecture Notes in Computer Science.

[5]  Mayur Naik,et al.  From symptom to cause: localizing errors in counterexample traces , 2003, POPL '03.

[6]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[7]  Philip J. Guo,et al.  OverCode: visualizing variation in student solutions to programming problems at scale , 2014, ACM Trans. Comput. Hum. Interact..

[8]  Xiangyu Zhang,et al.  Apex: automatic programming assignment error explanation , 2016, OOPSLA.

[9]  Armando Solar-Lezama,et al.  sk_p: a neural program corrector for MOOCs , 2016, SPLASH.

[10]  Claire Le Goues,et al.  A genetic programming approach to automated software repair , 2009, GECCO.

[11]  M. Braga,et al.  Exploratory Data Analysis , 2018, Encyclopedia of Social Network Analysis and Mining. 2nd Ed..

[12]  Sumit Gulwani,et al.  Automated feedback generation for introductory programming assignments , 2012, PLDI.

[13]  Roderick Bloem,et al.  Finding and Fixing Faults , 2005, CHARME.

[14]  John W. Tukey,et al.  Exploratory data analysis , 1977, Addison-Wesley series in behavioral science : quantitative methods.

[15]  Sumit Gulwani,et al.  Automated clustering and program repair for introductory programming assignments , 2016, PLDI.

[16]  Sumit Gulwani,et al.  Feedback generation for performance problems in introductory programming assignments , 2014, SIGSOFT FSE.

[17]  Parosh Aziz Abdulla,et al.  Proceedings of the 17th international conference on Tools and algorithms for the construction and analysis of systems: part of the joint European conferences on theory and practice of software , 2011 .

[18]  Roderick Bloem,et al.  Automated error localization and correction for imperative programs , 2011, 2011 Formal Methods in Computer-Aided Design (FMCAD).

[19]  Kaizhong Zhang,et al.  Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems , 1989, SIAM J. Comput..

[20]  Rishabh Singh,et al.  Qlose: Program Repair with Quantitative Objectives , 2016, CAV.

[21]  W. Eric Wong,et al.  Using Mutation to Automatically Suggest Fixes for Faulty Programs , 2010, 2010 Third International Conference on Software Testing, Verification and Validation.

[22]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[23]  Rupak Majumdar,et al.  Cause clue clauses: error localization using maximum satisfiability , 2010, PLDI '11.

[24]  Sumit Gulwani,et al.  Semi-supervised verified feedback generation , 2016, SIGSOFT FSE.

[25]  Fan Long,et al.  Automatic patch generation by learning correct code , 2016, POPL.

[26]  Zhendong Su,et al.  DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones , 2007, 29th International Conference on Software Engineering (ICSE'07).

[27]  Kuo-Chung Tai,et al.  The Tree-to-Tree Correction Problem , 1979, JACM.

[28]  Andreas Zeller,et al.  Simplifying and Isolating Failure-Inducing Input , 2002, IEEE Trans. Software Eng..

[29]  Sumit Gulwani,et al.  Learning Syntactic Program Transformations from Examples , 2016, 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE).

[30]  Ulrich Junker,et al.  QUICKXPLAIN: Preferred Explanations and Relaxations for Over-Constrained Problems , 2004, AAAI.

[31]  Nikolai Tillmann,et al.  Code hunt: gamifying teaching and learning of computer science at scale , 2014, L@S.

[32]  Andrea Arcuri,et al.  On the automation of fixing software bugs , 2008, ICSE Companion '08.

[33]  Roderick Bloem,et al.  Finding and fixing faults , 2005, J. Comput. Syst. Sci..