Is this Model for Real? Simulating Data to Reveal the Proximity of a Model to Reality

Simulated data play a central role in Educational Data Mining, and in Bayesian Knowledge Tracing (BKT) research in particular. The initial motivation for this paper was the question: given two datasets, can you tell which is real and which is simulated? The ability to answer this question may provide an additional indication of the goodness of the model. If simulated data are easy to distinguish from real data, that could indicate that the model does not provide an authentic representation of reality; if the two are hard to tell apart, that might indicate that the model is indeed authentic. In this paper we describe analyses of 42 GLOP datasets performed in an attempt to address this question. We also discuss possible metrics based on simulated data, as well as additional findings that emerged during this exploration.
