Semi-supervised verified feedback generation

Students have enthusiastically taken to online programming lessons and contests. Unfortunately, they tend to struggle for lack of personalized feedback. There is an urgent need for program analysis and repair techniques that can handle both the scale of and the variation in student submissions while ensuring the quality of feedback. Towards this goal, we present a novel methodology called semi-supervised verified feedback generation. We cluster submissions by solution strategy and ask the instructor to identify or add a correct submission in each cluster. We then verify every submission in a cluster against the instructor-validated submission in the same cluster. If faults are detected in a submission, feedback suggesting fixes for them is generated. Clustering reduces both the burden on the instructor and the variation that has to be handled during feedback generation. Verified feedback generation ensures that only correct feedback is generated. We implemented a tool, named CoderAssist, based on this approach and evaluated it on dynamic programming assignments. We designed a novel counter-example guided feedback generation algorithm capable of suggesting fixes to all faults in a submission. In an evaluation on 2226 submissions to 4 problems, CoderAssist generated verified feedback for 1911 (85%) submissions, taking 1.6 s per submission on average. It also reduced the burden on the instructor: only one submission had to be manually validated or added for every 16 submissions.
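
The workflow described above (cluster by strategy, pick one instructor-validated submission per cluster, check every other submission in the cluster against it) can be illustrated with a minimal Python sketch. This is not CoderAssist's algorithm. It makes two loud assumptions: the strategy labels are supplied by hand instead of being computed by a clustering step, and bounded testing on a fixed list of inputs stands in for verification, so a submission that passes here is only unrefuted, not verified. All names (`cluster_by_strategy`, `find_counterexample`, `generate_feedback`, the Fibonacci submissions) are hypothetical.

```python
from collections import defaultdict

# Three hypothetical student submissions for "compute the n-th Fibonacci number".
def fib_rolling_ok(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_rolling_bad(n):
    a, b = 1, 1  # faulty base case: fib(0) should be 0
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_table_ok(n):
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

def cluster_by_strategy(labels):
    """Group submission names by strategy label (labels are assumed given)."""
    clusters = defaultdict(list)
    for name, strategy in labels.items():
        clusters[strategy].append(name)
    return dict(clusters)

def find_counterexample(candidate, reference, inputs):
    """Return the first input where the outputs differ, or None.
    Assumption: bounded testing replaces the verification step."""
    for x in inputs:
        if candidate(x) != reference(x):
            return x
    return None

def generate_feedback(submissions, clusters, references, inputs):
    """Compare each submission with the validated one in its own cluster."""
    feedback = {}
    for strategy, members in clusters.items():
        reference = submissions[references[strategy]]
        for name in members:
            cex = find_counterexample(submissions[name], reference, inputs)
            if cex is None:
                feedback[name] = "no difference found"
            else:
                feedback[name] = f"differs from reference on input {cex}"
    return feedback

submissions = {
    "fib_rolling_ok": fib_rolling_ok,
    "fib_rolling_bad": fib_rolling_bad,
    "fib_table_ok": fib_table_ok,
}
# Hand-assigned strategy labels (an assumption of this sketch).
labels = {
    "fib_rolling_ok": "rolling",
    "fib_rolling_bad": "rolling",
    "fib_table_ok": "table",
}
# One instructor-validated submission per cluster.
references = {"rolling": "fib_rolling_ok", "table": "fib_table_ok"}

clusters = cluster_by_strategy(labels)
feedback = generate_feedback(submissions, clusters, references, range(10))
```

Comparing a submission only with a reference that follows the same strategy keeps the two programs structurally close, which is what makes a per-cluster check simpler than a comparison against a single global solution.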
