Knowledge Tracing is perhaps the most widely used student model in the field of educational data mining. In this paper we report on the effects of using only a subset of the data to train the Bayesian Network that represents this student model. The standard practice is to use all of the students' data for a given skill to fit the model. We analyze two datasets: one from the Algebra Cognitive Tutor and the other from the Genetics Cognitive Tutor. We found that, in both datasets, the accuracy obtained using all of the students' data was not significantly different from the accuracy obtained using only the most recent 15 data points of each student. Using only 15 responses, however, resulted in EM training that was 15 times faster than training on all of the data. This result suggests that the Knowledge Tracing model needs only a small range of data in order to learn reliable parameters. The implication of this result is a substantial saving in model training time, which allows more complex models to be fit or individualized models to be trained online.
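The data reduction described above can be illustrated with a minimal sketch. It assumes each student's responses for one skill are stored as a chronologically ordered list of 0/1 correctness values. The function name `keep_most_recent`, the dictionary layout, and the sample data are hypothetical and are not taken from the paper. The sketch covers only the truncation step, not the EM fitting itself.

```python
from typing import Dict, List


def keep_most_recent(
    responses: Dict[str, List[int]], window: int = 15
) -> Dict[str, List[int]]:
    """Keep only the last `window` responses of each student, in original order.

    `responses` maps a (hypothetical) student id to that student's
    chronologically ordered 0/1 correctness values for a single skill.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    # A negative slice keeps the tail; shorter sequences are returned whole.
    return {student: seq[-window:] for student, seq in responses.items()}


# Hypothetical example data: one long sequence and one short sequence.
data = {
    "s1": [0, 0, 1, 0, 1] + [1] * 15,  # 20 responses
    "s2": [0, 1, 1],                   # 3 responses
}
recent = keep_most_recent(data, window=15)
```

The truncated sequences in `recent` would then be passed to whatever EM routine fits the Knowledge Tracing parameters, in place of the full sequences. Students with fewer than 15 responses keep all of their data.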