Mining Educational Data to Predict Student’s Academic Performance Using Ensemble Methods

Educational data mining has received considerable attention in the last few years. Many data mining techniques have been proposed to extract hidden knowledge from educational data. The extracted knowledge helps institutions improve their teaching methods and learning processes. These improvements in turn enhance student performance and overall educational outcomes. In this paper, we propose a new student performance prediction model based on data mining techniques with a new category of data attributes, called student behavioral features. These features are related to the learner’s interactivity with the e-learning management system. The performance of the predictive model is evaluated with a set of classifiers, namely Artificial Neural Network, Naive Bayes, and Decision Tree. In addition, we apply ensemble methods to improve the performance of these classifiers. We use Bagging, Boosting, and Random Forest (RF), which are the ensemble methods most commonly used in the literature. The results reveal a strong relationship between learners’ behaviors and their academic achievement. With behavioral features, the accuracy of the proposed model improves by up to 22.1% compared with the results obtained without them, and ensemble methods improve accuracy by up to 25.8%. When the model is tested on newcomer students, it achieves an accuracy of more than 80%. This result demonstrates the reliability of the proposed model.
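
To make the ensemble idea concrete, the following is a minimal sketch of Bagging (bootstrap aggregation with majority voting), one of the three ensemble methods named in the abstract. It is not the authors' implementation. The base learner (a one-feature decision stump), the synthetic data, and the feature names `raised_hands` and `visited_resources` are assumptions made purely for illustration, and the sketch uses only the Python standard library.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a decision stump: one feature, one threshold, chosen by exhaustive search."""
    best = None  # (training errors, feature index, threshold, left label, right label)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its majority class; left is never empty
            # because t is always one of the observed feature values.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def bagging_fit(rows, labels, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the base learners by majority vote."""
    return Counter(m(row) for m in models).most_common(1)[0][0]


# Synthetic, illustrative data only: two hypothetical behavioral features per
# student (raised_hands, visited_resources), each in the range 0..100.
data_rng = random.Random(42)
rows = [[data_rng.randint(0, 100), data_rng.randint(0, 100)] for _ in range(120)]
labels = ["High" if raised + visited > 100 else "Low" for raised, visited in rows]

# Hold out the last 30 students for evaluation.
train_rows, test_rows = rows[:90], rows[90:]
train_labels, test_labels = labels[:90], labels[90:]

models = bagging_fit(train_rows, train_labels)
correct = sum(bagging_predict(models, r) == y for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_rows)
print(f"Held-out accuracy of the bagged stumps: {accuracy:.2f}")
```

Each stump sees a different bootstrap sample, so the stumps disagree near the class boundary and the vote averages out their individual errors. Boosting and Random Forest differ in how the base learners are built (sequential reweighting and random feature subsets, respectively) but combine predictions in a similar spirit.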
