Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference

In modern computer science education, massive open online courses (MOOCs) log thousands of hours of data about how students solve coding challenges. Being so rich in data, these platforms have garnered the interest of the machine learning community, with many new algorithms attempting to autonomously provide feedback to help future students learn. But what about those first hundred thousand students? In most educational contexts (i.e., classrooms), assignments do not have enough historical data for supervised learning. In this paper, we introduce a human-in-the-loop "rubric sampling" approach to tackle the "zero shot" feedback challenge. We are able to provide autonomous feedback for the first students working on an introductory programming assignment with accuracy that substantially outperforms data-hungry algorithms and approaches human-level fidelity. Rubric sampling requires minimal teacher effort, can associate feedback with specific parts of a student's solution, and can articulate a student's misconceptions in the language of the instructor. Deep learning inference enables rubric sampling to further improve as more assignment-specific student data is acquired. We demonstrate our results on a novel dataset from this http URL, the world's largest programming education platform.
