Identifying typical approaches and errors in Prolog programming with argument-based machine learning

Abstract Students learn programming much faster when they receive feedback. However, in programming courses with high student-teacher ratios, it is practically impossible to provide feedback on every homework submission. In this paper, we propose a data-driven tool for the semi-automatic identification of typical approaches and errors in student solutions. Given a list of frequent errors, a teacher can prepare common feedback for all students that explains the difficult concepts. We cast the problem as supervised rule learning, where each rule corresponds to a specific approach or error. The correct and incorrect submitted programs serve as learning examples, and patterns in their abstract syntax trees (ASTs) serve as attributes. As the space of all possible patterns is immense, we needed the help of experts to select the relevant ones. To elicit knowledge from the experts, we used the argument-based machine learning (ABML) method, in which an expert and ABML interactively exchange arguments until the model is good enough. We provide a step-by-step demonstration of the ABML process, present examples of ABML questions and the corresponding expert answers, and interpret some of the induced rules. An evaluation on 42 Prolog exercises further shows the usefulness of the knowledge elicitation process: models constructed using ABML achieve significantly better accuracy than models learned from human-defined patterns or from automatically extracted patterns.
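
To make the representation concrete, the following is a minimal, self-contained Python sketch of the attribute-construction step described above: each submitted program's AST is tested for the presence of candidate patterns, and the resulting binary vector is what a rule learner such as CN2 would consume. The nested-tuple AST encoding, the "_" wildcard, and the toy conc/3 patterns are assumptions made purely for illustration; the paper's actual pattern language and matching procedure may differ.

```python
# Sketch (not from the paper): describe each program by binary attributes,
# one per AST pattern, suitable as input to a rule learner.

def matches_at(node, pattern):
    """True if `pattern` matches the subtree rooted at `node`."""
    if pattern == "_":                          # wildcard: matches any subtree
        return True
    if isinstance(pattern, str) or isinstance(node, str):
        return node == pattern                  # leaves must be identical
    if node[0] != pattern[0] or len(node) != len(pattern):
        return False                            # functor and arity must agree
    return all(matches_at(n, p) for n, p in zip(node[1:], pattern[1:]))

def contains(tree, pattern):
    """True if `pattern` matches at some node of `tree`."""
    if matches_at(tree, pattern):
        return True
    if isinstance(tree, str):
        return False
    return any(contains(child, pattern) for child in tree[1:])

def attribute_table(programs, patterns):
    """One row per program AST, one binary attribute per pattern."""
    return [[int(contains(ast, p)) for p in patterns] for ast in programs]

# Toy ASTs for the base case of list concatenation, conc/3:
#   correct clause:  conc([], L, L).      buggy clause:  conc([], L, []).
correct = ("clause", ("conc", "[]", "L", "L"))
buggy   = ("clause", ("conc", "[]", "L", "[]"))
patterns = [("conc", "[]", "L", "L"),       # the correct base case
            ("conc", "_", "_", "[]")]       # third argument fixed to []
print(attribute_table([correct, buggy], patterns))   # -> [[1, 0], [0, 1]]
```

With labels marking each program as correct or incorrect, such binary attribute vectors are exactly the kind of input a rule learner needs; each induced rule then names a combination of present or absent patterns that characterizes a typical approach or error.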
