EdNet: A Large-Scale Hierarchical Dataset in Education

Advances in Artificial Intelligence in Education (AIEd) and the ever-growing scale of Interactive Educational Systems (IESs) have led to the rise of data-driven approaches for knowledge tracing and learning path recommendation. Unfortunately, collecting student interaction data is challenging and costly. As a result, there is no public large-scale benchmark dataset reflecting the wide variety of student behaviors observed in modern IESs. Although several datasets, such as ASSISTments, Junyi Academy, Synthetic, and STATICS, are publicly available and widely used, they are not large enough to leverage the full potential of state-of-the-art data-driven models. Furthermore, the behavior they record is limited to question-solving activities. To address these limitations, we introduce EdNet, a large-scale hierarchical dataset of diverse student activities collected by Santa, a multi-platform self-study solution equipped with an artificial intelligence tutoring system. EdNet contains 131,417,236 interactions from 784,309 students collected over more than two years, making it the largest public IES dataset released to date. Unlike existing datasets, EdNet records a wide variety of student actions, ranging from question-solving to lecture consumption to item purchasing. EdNet also has a hierarchical structure that divides student actions into four levels of abstraction. The features of EdNet are domain-agnostic, so the dataset can be easily extended to different domains. The dataset is publicly released for research purposes. We plan to host challenges on multiple AIEd tasks with EdNet to provide common ground for fair comparison between state-of-the-art models and to encourage the development of practical and effective methods.
