Improving final grade prediction accuracy in blended learning environment using voting ensembles

This paper presents a comparative analysis of prediction classifiers in a blended learning environment. The proposed model predicts students’ final grades based on their activities within different educational environments. A comparative study of classifier performance was carried out to determine which classifier is most suitable for a multi-class dataset. Different classifiers yielded important results for different classes, and a majority vote scheme is subsequently used to form an ensemble based on Naïve Bayes, Hidden Naïve Bayes, J48 decision tree and Random Forest. According to the experimental evaluation, the proposed model significantly improves the precision and accuracy of students’ grade prediction in the blended learning scenario. The major contribution of this research is an efficient multi-class prediction model applicable to that environment.
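The majority vote scheme mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_vote`, the grade labels, the per-classifier predictions, and the tie-breaking rule (on a tie, the label chosen by the earliest-listed classifier wins) are all assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by plurality vote.

    predictions: list of lists, one inner list per base classifier,
                 each holding one predicted label per sample.
    Tie-breaking (an assumption, not taken from the paper): among the
    labels with the most votes, pick the one voted for by the
    earliest-listed classifier.
    """
    if not predictions:
        raise ValueError("at least one classifier is required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    for votes in zip(*predictions):  # votes for a single sample
        counts = Counter(votes)
        top = max(counts.values())
        # First vote (in classifier order) that belongs to a top-scoring label.
        winner = next(v for v in votes if counts[v] == top)
        combined.append(winner)
    return combined


# Hypothetical grade predictions for four students from four base classifiers.
naive_bayes = ["A", "B", "C", "F"]
hidden_nb = ["A", "C", "C", "D"]
j48_tree = ["B", "B", "C", "D"]
random_forest = ["A", "B", "D", "F"]

ensemble = majority_vote([naive_bayes, hidden_nb, j48_tree, random_forest])
print(ensemble)
```

With an even number of base classifiers, ties such as the 2-2 split on the last student can occur, so whichever tie-breaking rule is chosen affects the result and should be stated explicitly.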
