Knowledge Based and Intelligent Information and Engineering Systems, KES 2017, 6-8 September 2017, Marseilles, France

Decision tree learning used for the classification of student archetypes in online courses

Ubiquitous Internet access lets individuals share more information than ever before and allows young people to collaborate and learn from a distance, so educational systems are constantly being reshaped. Understanding eLearning is important, and so is understanding the typology of students who participate in this trend with increasing dedication. Yet we consider that the accelerated spread of online education has left behind an important aspect of teaching, namely studying and understanding student archetypes. By this we mean the common patterns that define how students interact with courses, how much dedication they invest, and how likely they are to finish. This paper introduces an original set of student profiles specific to online courses, derived by means of data mining and supervised learning. We use the responses to an online questionnaire to gather detailed opinions from 632 students from Romania regarding the advantages and disadvantages of MOOCs, as well as the reasons for not joining online courses. Based on the extracted statistics, we present six decision trees for classifying the finalization and participation rates of online courses based on the students' individual traits. Furthermore, we discuss these profiles and explain the implications of this study. We believe our findings bring novelty both to understanding the needs of modern students and to optimizing the way eLearning is further developed.
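As a rough illustration of how a decision tree chooses its first question from questionnaire answers, the sketch below ranks candidate attributes by information gain, the criterion used by ID3-style learners. The abstract does not state which tree algorithm or split criterion the study used, so that choice is an assumption here. The attribute names (`employed`, `prior_mooc`, `finished`) and the six records are hypothetical and are not taken from the survey.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction on `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Return the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical questionnaire records (illustrative only, not the study's data).
responses = [
    {"employed": "yes", "prior_mooc": "yes", "finished": "yes"},
    {"employed": "yes", "prior_mooc": "no", "finished": "no"},
    {"employed": "no", "prior_mooc": "yes", "finished": "yes"},
    {"employed": "no", "prior_mooc": "no", "finished": "no"},
    {"employed": "no", "prior_mooc": "yes", "finished": "yes"},
    {"employed": "yes", "prior_mooc": "no", "finished": "no"},
]

# The attribute chosen here would become the root node of the tree.
root = best_split(responses, ["employed", "prior_mooc"], "finished")
print(root)
```

In this toy data `prior_mooc` separates finishers from non-finishers perfectly, so it is selected as the root. A full learner would repeat the same selection recursively on each resulting subset until the subsets are pure or no attributes remain.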
