Learning What Works in ITS from Non-traditional Randomized Controlled Trial Data

The traditional, well-established approach to finding out what works in education research is to run a randomized controlled trial (RCT) using a standard pretest and posttest design. RCTs have been used in the intelligent tutoring community for decades to determine which questions and tutorial feedback work best. Practically speaking, however, ITS creators need to make decisions about what content to deploy without the benefit of having run an RCT in advance. Additionally, most log data produced by an ITS is not in a form that can easily be evaluated with traditional methods. As a result, tutoring systems produce a great deal of data that we would like to learn from but currently do not. In prior work we introduced a potential solution to this problem: a Bayesian networks method that analyzes the log data of a tutoring system to determine which items are most effective for learning among a set of items of the same skill. The method was validated by way of simulations. In this work we further evaluate the method by applying it to real-world data from 11 experiment datasets that investigate the effectiveness of various forms of tutorial help in a web-based math tutoring system. The goal of the method was to determine which questions and tutorial strategies cause the most learning. We compared these results with a more traditional hypothesis-testing analysis, adapted to our particular datasets. We analyzed experiments in mastery learning problem sets as well as experiments in problem sets that, even though they were not planned RCTs, took on the standard RCT form. We found that the tutorial help or item chosen by the Bayesian method as having the highest rate of learning agreed with the traditional analysis in 9 of the 11 experiments. The practical impact of this work is the abundance of knowledge about what works that can now be learned from the thousands of experimental designs intrinsic in datasets of tutoring systems that assign items in a random order.
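To make the idea concrete, the sketch below illustrates one way a knowledge-tracing-style Bayesian model can attribute different learning rates to items of the same skill when students receive the items in randomized order. This is a minimal illustration, not the authors' exact model: the two-state formulation, the fixed guess/slip/prior values, the grid-search estimation, and the toy data are all assumptions made for the example.

```python
# Minimal sketch (not the paper's exact method): estimate per-item learning
# rates with a two-state, knowledge-tracing-style Bayesian model, using the
# fact that items of one skill are seen in randomized order across students.
import itertools
import numpy as np

GUESS, SLIP, PRIOR = 0.15, 0.10, 0.30   # assumed, fixed emission/prior parameters

def sequence_likelihood(orders, responses, learn_rates):
    """Log-likelihood of all students' response sequences when the probability
    of transitioning to the 'known' state depends on which item was practiced."""
    total_ll = 0.0
    for order, resp in zip(orders, responses):
        p_known = PRIOR
        for item, correct in zip(order, resp):
            p_correct = p_known * (1 - SLIP) + (1 - p_known) * GUESS
            total_ll += np.log(p_correct if correct else 1 - p_correct)
            # posterior probability of knowledge given the observed response
            if correct:
                post = p_known * (1 - SLIP) / p_correct
            else:
                post = p_known * SLIP / (1 - p_correct)
            # item-specific learning transition
            p_known = post + (1 - post) * learn_rates[item]
    return total_ll

def rank_items_by_learning(orders, responses, n_items, grid=(0.05, 0.15, 0.3, 0.5)):
    """Coarse grid search over per-item learn rates; returns the best-fitting rates."""
    best_ll, best_rates = -np.inf, None
    for rates in itertools.product(grid, repeat=n_items):
        ll = sequence_likelihood(orders, responses, rates)
        if ll > best_ll:
            best_ll, best_rates = ll, rates
    return best_rates

# Hypothetical toy data: 3 students, 3 items of one skill, randomized orders,
# responses coded 1 = correct, 0 = incorrect.
orders = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]
responses = [[0, 0, 1], [0, 1, 1], [0, 1, 1]]
print(rank_items_by_learning(orders, responses, n_items=3))
```

The item assigned the highest estimated learn rate is the one the model credits with the most learning; in the paper, such a model-based ranking is what gets compared against the hypothesis-testing analysis.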
