Mining MOOC Clickstreams: Video-Watching Behavior vs. In-Video Quiz Performance

Student video-watching behavior and quiz performance are studied in two Massive Open Online Courses (MOOCs). Two frameworks are presented for representing video-watching clickstreams: one based on the sequence of events created, and another on the sequence of positions visited. With the event-based framework, recurring subsequences of student behavior are extracted; these capture fundamental characteristics such as reflecting (i.e., repeatedly playing and pausing) and revising (i.e., playing and skipping back). Some of these behaviors are found to be significantly correlated with changes in the likelihood that a student will be Correct on First Attempt (CFA) on a quiz question, in ways that are not necessarily intuitive. Then, with the position-based framework, models of quiz performance are devised based on the positions visited in a video. In evaluating these models through CFA prediction, three of them are found to substantially improve prediction quality, which underlines the ability to relate this type of behavior to quiz scores. Since this prediction considers videos individually, these benefits also suggest that the models are useful in situations where training data is limited, e.g., for early detection or in short courses.
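
To make the two representations concrete, the following is a minimal sketch, not the method used in the paper. It assumes hypothetical event codes (`Pl` = play, `Pa` = pause, `Sb` = skip back), uses plain n-gram counting as a stand-in for the recurring-subsequence extraction, and summarizes positions visited as seconds played per fixed-width bin. The function names, the bin size, and the toy data are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical event codes (not taken from the paper):
# "Pl" = play, "Pa" = pause, "Sb" = skip back.


def event_ngrams(events, n):
    """Return every contiguous length-n subsequence of one event sequence."""
    return [tuple(events[i:i + n]) for i in range(len(events) - n + 1)]


def recurring_subsequences(sequences, n, min_support):
    """Event-based view: keep length-n subsequences that occur in at
    least `min_support` of the given sequences (counted once each)."""
    support = Counter()
    for events in sequences:
        support.update(set(event_ngrams(events, n)))
    return {gram: c for gram, c in support.items() if c >= min_support}


def seconds_played_per_bin(segments, video_length, bin_size):
    """Position-based view: given played (start, end) segments in whole
    seconds (end exclusive), count the seconds played in each bin."""
    n_bins = -(-video_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for start, end in segments:
        for second in range(start, min(end, video_length)):
            counts[second // bin_size] += 1
    return counts


# Toy data, invented for illustration only.
sequences = [
    ["Pl", "Pa", "Pl", "Pa", "Pl"],  # reflecting: repeated play/pause
    ["Pl", "Sb", "Pl", "Sb", "Pl"],  # revising: plays and skip backs
    ["Pl", "Pa", "Pl", "Sb", "Pl"],  # a mix of both
]
motifs = recurring_subsequences(sequences, n=3, min_support=2)
bins = seconds_played_per_bin([(0, 20), (10, 30)], video_length=40, bin_size=10)
print(motifs)
print(bins)  # [10, 20, 10, 0]
```

In this toy input, the play-pause-play and play-skip back-play patterns each appear in two of the three sequences, so they are the only ones retained. Either output could then serve as a per-video feature vector for a CFA classifier, though the choice of features and model here is an assumption rather than the paper's design.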
