A machine learning-based methodology to predict learners' dropout, success or failure in MOOCs

Although MOOCs (massive open online courses) are becoming a trend in distance learning, they suffer from a very high learner dropout rate; as a result, on average only 10 per cent of enrolled learners obtain their certificates of achievement. This paper aims to give tutors a clearer basis for effective, personalized intervention to “retain” each type of learner at risk of dropping out.

This paper presents a methodology for predicting learners’ behaviors. The work, which uses a Stanford data set, was divided into several phases: data extraction, an exploratory study and then a multivariate analysis to reduce dimensionality and extract the most relevant features. Next, five machine learning algorithms were compared. Finally, the authors used association rules to extract similarities between the behaviors of learners who dropped out of the MOOC.

The results show that deep learning gives the best predictions in terms of accuracy, an average of 95.8 per cent, which is comparable to other measures such as precision, AUC, recall and F1 score.

Many research studies have tackled the MOOC dropout problem by proposing different dropout prediction models. The present proposal belongs to the same line of work, but the authors try to predict not only learners at risk of dropping out of MOOCs but also those who will succeed or fail.
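To make the reported measures concrete, the following is a minimal sketch of how accuracy, precision, recall and F1 score could be computed for a three-class prediction (dropout, success, failure). It is not the authors' code: the label names, the function `classification_metrics` and the choice of macro-averaging are assumptions made for illustration. AUC is left out because it requires predicted scores rather than hard labels.

```python
from collections import Counter

# Hypothetical class names; the paper's own label encoding is not given.
LABELS = ("dropout", "success", "failure")


def classification_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall and F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count every (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(pairs[(c, c)] for c in labels) / len(y_true)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        predicted = sum(pairs[(t, c)] for t in labels)  # predicted as c
        actual = sum(pairs[(c, p)] for p in labels)     # truly c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels invented for this example, not data from the paper.
    truth = ["dropout", "dropout", "success", "failure"]
    guess = ["dropout", "success", "success", "failure"]
    print(classification_metrics(truth, guess))
```

Macro-averaging weights each class equally, which matters when dropouts far outnumber certified learners; a weighted average would be an equally reasonable choice.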
