Applying Data Mining Techniques to Predict Student Dropout: A Case Study

Preventing student dropout is a priority for many educational institutions. In this paper we describe the results of an educational data analytics case study focused on detecting dropout among System Engineering (SE) undergraduate students after 7 years of enrollment in a Colombian university. The original data are extended and enriched through a feature engineering process. Our experimental results show that simple algorithms achieve reliable levels of accuracy in identifying predictors of dropout. We compare the results of Decision Trees, Logistic Regression and Naïve Bayes in order to propose the best option. We also evaluate Watson Analytics to establish how usable the service is for a non-expert user. We present the main results with the aim of reducing the dropout rate by identifying its potential causes. In addition, we report some findings on data quality that can improve the student data collection process.
