RKT: Relation-Aware Self-Attention for Knowledge Tracing

The world has moved into a new phase of online learning in response to the recent COVID-19 pandemic. Now more than ever, it is paramount to advance online learning in every way so that the education system continues to flourish. One crucial component of online learning is Knowledge Tracing (KT). The aim of KT is to model a student's knowledge level based on their answers to a sequence of exercises, referred to as interactions. Students acquire skills while solving exercises, and each such interaction has a distinct impact on the student's ability to solve a future exercise. This impact is characterized by 1) the relation between the exercises involved in the interactions and 2) the student's forgetting behavior. Traditional studies on knowledge tracing do not explicitly model both components jointly to estimate the impact of these interactions. In this paper, we propose a novel Relation-aware self-attention model for Knowledge Tracing (RKT). We introduce a relation-aware self-attention layer that incorporates contextual information. This contextual information integrates exercise relation information, derived from the exercises' textual content and from student performance data, with forgetting behavior information, modeled by an exponentially decaying kernel function. Extensive experiments on three real-world datasets, two of which are new collections released to the public, show that our model outperforms state-of-the-art knowledge tracing methods. Furthermore, the interpretable attention weights help visualize the relations between interactions and the temporal patterns in the human learning process.
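As a rough illustration of the idea in the abstract, the sketch below blends ordinary scaled dot-product attention over past interactions with a contextual term built from an exercise-relation score and an exponentially decaying function of elapsed time. The abstract does not give the formulas, so the function name `relation_aware_weights`, the parameters `decay_rate` and `mix`, and the way the two terms are combined are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def relation_aware_weights(query, keys, exercise_relation, elapsed,
                           decay_rate=0.1, mix=0.5):
    """Hypothetical relation-aware attention over past interactions.

    query:             (d,) vector for the exercise to be predicted
    keys:              (n, d) vectors for the n past interactions
    exercise_relation: (n,) relation score of each past exercise to the query
    elapsed:           (n,) time elapsed since each past interaction
    decay_rate, mix:   assumed hyperparameters, not taken from the abstract
    """
    d = query.shape[-1]
    # Standard scaled dot-product attention weights.
    attention = softmax(keys @ query / np.sqrt(d))
    # Forgetting signal: an exponentially decaying kernel over elapsed time.
    forgetting = np.exp(-decay_rate * elapsed)
    # Contextual weights from relation and forgetting (assumed combination).
    context = softmax(exercise_relation + forgetting)
    # Convex blend keeps the result a valid probability distribution.
    return mix * attention + (1.0 - mix) * context


# Three past interactions with identical content and relation scores:
# only the elapsed time differs, so more recent interactions weigh more.
weights = relation_aware_weights(
    query=np.array([1.0, 0.0]),
    keys=np.ones((3, 2)),
    exercise_relation=np.zeros(3),
    elapsed=np.array([10.0, 5.0, 1.0]),
)
print(weights)
```

Because both terms are normalized before blending, the resulting weights still sum to one and can be read directly as the relative influence of each past interaction.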
