The utilization of data analysis techniques in predicting student performance in massive open online courses (MOOCs)

The growth of the Internet has driven a steady rise in the popularity of open online learning platforms. This has led to the emergence of Massive Open Online Courses (MOOCs), which enrol millions of people worldwide. Such courses operate under the concept of open learning, where content does not have to be delivered through the standard mechanisms that institutions employ, such as lectures attended in person. Instead, learning occurs online via recorded lecture material and online tasks. This shift has allowed more people to gain access to education, regardless of their learning background. However, despite these advancements, completion rates for MOOCs are low. This paper presents our approach to learner prediction in MOOCs: it explores the impact that technology has on open learning and identifies how data about student performance can be captured to predict trends, so that at-risk students can be identified before they drop out. The study uses the eRegister system, which has been developed to capture and analyze data. The results indicate that high levels of active engagement, interaction and attendance are associated with higher marks. Additionally, our approach is able to normalize the data into a consistent series, so that the end result can be transformed into a dashboard of statistics for use by the organizers of the MOOC. Based on this, we conclude that there is a fundamental need for predictive systems within learning communities.
