Your Model Is Predictive - But Is It Useful? Theoretical and Empirical Considerations of a New Paradigm for Adaptive Tutoring Evaluation

Classification evaluation metrics are often used to evaluate adaptive tutoring systems, programs that teach and adapt to humans. Unfortunately, it is not clear how intuitive these metrics are for practitioners with little machine learning background. Moreover, our experiments suggest that the existing conventions for evaluating tutoring systems may lead to suboptimal decisions. We propose the Learner Effort-Outcomes Paradigm (Leopard), a new framework for evaluating adaptive tutoring. We introduce Teal and White, novel automatic metrics that apply Leopard and quantify the amount of effort required to achieve a learning outcome. Our experiments suggest that these metrics are a better alternative to classification metrics for evaluating adaptive tutoring.
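To make the effort-outcomes idea concrete, below is a minimal sketch of how such a metric could be computed; it is an illustrative assumption, not the actual definition of Teal or White. It simulates a mastery-based tutor that stops assigning practice once a student model predicts success above a threshold: effort is the number of practice items assigned, and the outcome is the predicted success when practice stops. The function name `effort_outcome`, the stopping rule, and the 0.95 threshold are all hypothetical.

```python
def effort_outcome(p_correct, mastery_threshold=0.95):
    """Summarize one student-skill practice sequence (illustrative sketch).

    p_correct: model-predicted probabilities of answering correctly at
    each practice opportunity, in temporal order.

    Returns (effort, outcome):
      effort  -- practice items assigned before the model predicts mastery
      outcome -- predicted probability of success when practice stops
    """
    for items_practiced, p in enumerate(p_correct):
        if p >= mastery_threshold:
            return items_practiced, p
    # The model never predicts mastery: the student practices every item,
    # and the outcome is the final prediction.
    return len(p_correct), p_correct[-1]


# Example: a student model whose predicted success rises with practice.
predictions = [0.40, 0.55, 0.70, 0.85, 0.96, 0.97]
effort, outcome = effort_outcome(predictions)
print(f"effort = {effort} items, outcome = {outcome:.2f}")
```

Under this reading, a tutor driven by a well-calibrated model should reach a high outcome with low effort; comparing student models on these two axes, rather than on classification accuracy alone, is the trade-off Leopard is meant to expose.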
