The sum is greater than the parts: ensembling models of student knowledge in educational software

Many competing models have been proposed in the past decade for predicting student knowledge within educational software. Recent research has attempted to combine these models to improve performance, but has yielded inconsistent results. While work on the 2010 KDD Cup data set showed the benefits of ensemble methods, work with the Genetics Tutor failed to show similar benefits. We hypothesize that the key factor has been data set size. We explore the potential for improving student performance prediction with ensemble methods in a data set drawn from a different tutoring system, the ASSISTments Platform, which contains 15 times the number of responses of the Genetics Tutor data set. We evaluated the predictive performance of eight student models and eight methods of ensembling predictions. Within this data set, ensemble approaches were more effective than any single method, with the best ensemble approach producing predictions of student performance 10% better than the best individual student knowledge model.
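As a rough illustration of what ensembling predictions can mean, the sketch below averages the per-response predictions of two student models and compares errors. Everything in it is an assumption for illustration only: the abstract does not name the eight ensembling methods or the evaluation metric, so simple averaging and RMSE are stand-ins, and the names `rmse`, `mean_ensemble`, `model_a`, `model_b`, and the toy numbers are hypothetical.

```python
from math import sqrt


def rmse(predictions, outcomes):
    """Root mean squared error between predicted probabilities and 0/1 outcomes."""
    squared_errors = [(p - y) ** 2 for p, y in zip(predictions, outcomes)]
    return sqrt(sum(squared_errors) / len(outcomes))


def mean_ensemble(model_predictions):
    """Average several models' predictions, response by response.

    Simple averaging is an assumed stand-in; it is not claimed to be
    one of the ensembling methods evaluated in the paper.
    """
    n_models = len(model_predictions)
    return [sum(column) / n_models for column in zip(*model_predictions)]


# Made-up toy data: 1 = correct response, 0 = incorrect response.
outcomes = [1, 0, 1, 1, 0]
# Hypothetical predicted probabilities of a correct response from two models.
model_a = [0.9, 0.4, 0.6, 0.7, 0.1]
model_b = [0.6, 0.1, 0.9, 0.5, 0.4]

ensemble = mean_ensemble([model_a, model_b])

for name, preds in [("model_a", model_a), ("model_b", model_b), ("ensemble", ensemble)]:
    print(f"{name}: RMSE = {rmse(preds, outcomes):.3f}")
```

On this toy data the averaged predictions have a lower error than either model alone, because the two models err on different responses. That is the intuition behind ensembling, not a reproduction of the reported results.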
