Analyzing Learners Behavior in MOOCs: An Examination of Performance and Motivation Using a Data-Driven Approach

Massive open online courses (MOOCs) have seen increasing use and popularity at highly ranked universities in recent years. The opportunity to access high-quality courseware on such platforms, free of educational, financial, and geographical obstacles, has led to rapid growth in participant numbers. The increasing number and diversity of participating learners has opened up new horizons to the research community for the investigation of effective learning environments. Learning analytics has been used to investigate the impact of engagement on student performance. However, an extensive literature review indicates that there is little research on the impact of MOOCs, particularly on the link between behavioral engagement and motivation as predictors of learning outcomes. In this paper, we consider a dataset that originates from online courses provided by Harvard University and the Massachusetts Institute of Technology and delivered through the edX platform. Two sets of empirical experiments are conducted using both statistical and machine learning techniques. Statistical methods are used to examine the association between engagement level and performance, taking learners' educational backgrounds into account. The results indicate a significant gap between the successful and unsuccessful learner groups: successful learners are found to read and watch course material to a higher degree. Machine learning algorithms are used to automatically detect learners who lack motivation early in the course, thus giving instructors insight into student withdrawal.
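
As an illustration of the kind of association analysis described above, the following is a minimal sketch of a Pearson chi-square test of independence between engagement level and course outcome. The abstract does not name the specific statistical test used, so the choice of chi-square is an assumption, and the counts, the high/low engagement split, and the function name `chi_square_statistic` are all hypothetical and do not come from the edX dataset.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts (NOT from the paper):
# rows = engagement level (high, low), columns = outcome (pass, fail).
counts = [[30, 10],
          [20, 40]]

statistic = chi_square_statistic(counts)
dof = (len(counts) - 1) * (len(counts[0]) - 1)

# 3.841 is the standard chi-square critical value for 1 degree of freedom
# at the 0.05 significance level.
CRITICAL_VALUE_DOF1_ALPHA05 = 3.841
association_detected = statistic > CRITICAL_VALUE_DOF1_ALPHA05
```

A statistic above the critical value would suggest that outcome is not independent of engagement level in the hypothetical table; it says nothing about the direction of causation.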
