Predicting student academic performance using multi-model heterogeneous ensemble approach

The purpose of this paper is to empirically investigate and compare the use of multiple data sources, different classifiers and ensemble techniques in predicting student academic performance. The study compares the performance and efficiency of ensemble techniques that draw on different combinations of data sources with those of base classifiers that use a single data source.

Using a quantitative research methodology, data samples of 141 learners enrolled at the University of the West of Scotland were extracted from the institution’s databases and also collected through a survey questionnaire. The research focused on three data sources (student record system, learning management system and survey) and used three state-of-the-art data mining classifiers, namely decision tree, artificial neural network and support vector machine, for the modeling. In addition, ensembles of these base classifiers were used in the student performance prediction, and the performance of the seven models developed was compared using six evaluation metrics.

The results show that using multiple data sources together with heterogeneous ensemble techniques is efficient and accurate in predicting student performance and helps in properly identifying students at risk of attrition.

The approach proposed in this study will help educational administrators and policy makers in the education sector develop new higher education policies and curricula that are relevant to student retention. In addition, the general implication of this research for practice is that combining data sources supports accurate early identification of students at risk of dropping out of higher education (HE), so that the necessary support and intervention can be provided.

The research empirically investigated and compared the accuracy and efficiency of single classifiers and ensembles of classifiers that use single and multiple data sources. The study developed a novel hybrid model for predicting student performance that is both accurate and efficient. Generally, this research advances the understanding of how ensemble techniques can be applied to predicting student performance from learner data, and it addresses two fundamental questions: What combination of variables will accurately predict student academic performance? What is the potential of stacking ensemble techniques in accurately predicting student academic performance?
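As a rough illustration of what a heterogeneous stacking ensemble over the three base classifiers might look like, the sketch below uses scikit-learn. It is not the study's implementation: the data are synthetic (only the sample size of 141 mirrors the abstract), the feature count, hyperparameters, train/test split and the logistic-regression meta-learner are all assumptions made for the example, and accuracy stands in for the six evaluation metrics, which the abstract does not name.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the merged student data (NOT the study's data):
# 141 rows to mirror the sample size, 12 made-up numeric features.
X, y = make_classification(
    n_samples=141, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The three base classifiers named in the abstract; hyperparameters are assumed.
# The neural network and SVM are wrapped with a scaler because both are
# sensitive to feature scale.
base_models = [
    ("dt", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    )),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
]

# Stacking: out-of-fold predictions of the base models (5-fold CV) become the
# inputs of a meta-learner. Logistic regression is an assumed choice here.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)

# Fit each base model and the stacked ensemble, then score on the held-out split.
scores = {}
for name, model in base_models + [("stack", stack)]:
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

On real data, the feature matrix would be built by joining the student record, learning management system and survey variables per learner, and with a sample this small a repeated cross-validation would give a steadier comparison than a single split.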
