Identifying At-Risk Students in Massive Open Online Courses

Massive Open Online Courses (MOOCs) have received widespread attention for their potential to scale higher education, and multiple platforms such as Coursera, edX and Udacity have recently appeared. Despite their successes, a major problem faced by MOOCs is low completion rates. In this paper, we explore the accurate early identification of students who are at risk of not completing courses. We build predictive models weekly, over multiple offerings of a course. Furthermore, we envision student interventions that present meaningful probabilities of failure and are enacted only for marginal students. To be effective, predicted probabilities must be both well calibrated and smooth across weeks. We propose two transfer learning algorithms, based on logistic regression, that trade off smoothness and accuracy by adding a regularization term that penalizes the difference between failure probabilities in consecutive weeks. Experimental results on two offerings of a Coursera MOOC establish the effectiveness of our algorithms.
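
The idea of regularizing weekly logistic regression models toward smooth failure probabilities can be illustrated with a minimal sketch. This is not the paper's two algorithms, whose exact formulation the abstract does not give; it is one plausible joint-training variant under stated assumptions: every student has a feature vector for every week, the penalty is the mean squared difference between a student's predicted probabilities in consecutive weeks, and the names `fit_smoothed_weekly_lr`, `predict_weekly`, `roughness`, and `lam` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_smoothed_weekly_lr(X_weeks, y, lam=1.0, lr=0.05, n_iter=500):
    """Jointly fit one logistic regression per week by gradient descent.

    X_weeks: list of (n_students, n_features_t) arrays, same students in each.
    y: (n_students,) array, 1 = did not complete the course, 0 = completed.
    lam: weight of the smoothness penalty (assumed form, see lead-in).
    Returns one weight vector per week; the bias is the last entry.
    """
    n = len(y)
    # Append a constant column so each weekly model has a bias term.
    Xb = [np.hstack([X, np.ones((n, 1))]) for X in X_weeks]
    W = [np.zeros(X.shape[1]) for X in Xb]
    for _ in range(n_iter):
        # Predictions for all weeks, computed before any weight is updated.
        P = [sigmoid(X @ w) for X, w in zip(Xb, W)]
        for t in range(len(W)):
            # Gradient of the mean log-loss with respect to the logits.
            g = (P[t] - y) / n
            # Gradient of lam * mean((p_t - p_{t-1})^2) and of the matching
            # term for week t+1, both with respect to p_t.
            dp = np.zeros(n)
            if t > 0:
                dp += 2.0 * lam * (P[t] - P[t - 1]) / n
            if t < len(W) - 1:
                dp -= 2.0 * lam * (P[t + 1] - P[t]) / n
            # Chain rule through the sigmoid: dp/dz = p * (1 - p).
            g = g + dp * P[t] * (1.0 - P[t])
            W[t] = W[t] - lr * (Xb[t].T @ g)
    return W


def predict_weekly(X_weeks, W):
    """Predicted failure probability per student, one array per week."""
    return [
        sigmoid(np.hstack([X, np.ones((len(X), 1))]) @ w)
        for X, w in zip(X_weeks, W)
    ]


def roughness(P):
    """Mean squared change in predicted probability between adjacent weeks."""
    return float(np.mean([np.mean((P[t] - P[t - 1]) ** 2)
                          for t in range(1, len(P))]))


if __name__ == "__main__":
    # Synthetic data: one feature per week that grows more informative.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200).astype(float)
    X_weeks = [s * (2 * y - 1)[:, None] + rng.normal(size=(200, 1))
               for s in (0.2, 0.5, 1.0, 1.5)]
    for lam in (0.0, 2.0):
        W = fit_smoothed_weekly_lr(X_weeks, y, lam=lam)
        print(lam, roughness(predict_weekly(X_weeks, W)))
```

Setting `lam=0.0` reduces this to independent weekly logistic regressions; raising it pulls each student's weekly probabilities closer together at some cost in per-week fit, which is the smoothness versus accuracy trade-off the abstract describes. Calibration is not addressed by this sketch.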
