Detection of Desertion Patterns in University Students Using Data Mining Techniques: A Case Study

Student desertion (dropout) affects higher education institutions and their academic quality standards. It can have several causes, academic factors being one potential reason. The main objective of this research is to detect dropout patterns at the “Técnica del Norte” University (Ecuador), based on historical personal and academic data, using predictive classification techniques from data mining. The KDD (Knowledge Discovery in Databases) process was used to determine desertion patterns, following two approaches: (i) Bayesian classifiers and (ii) decision trees, both implemented in Weka. The classifiers' performance was quantitatively evaluated using the confusion matrix and quality metrics derived from it. The results showed that the decision-tree technique performed slightly better than the Bayesian approach on the processed data.
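As a rough illustration of how quality metrics can be derived from a confusion matrix, the sketch below computes accuracy, precision, recall, and F1 for a binary "dropout" versus "non-dropout" classifier. The function name `binary_metrics` and the counts in the usage line are hypothetical and are not taken from the study; the abstract does not specify which metrics were reported.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive common quality metrics from binary confusion-matrix counts.

    tp: dropouts correctly flagged      fp: non-dropouts wrongly flagged
    fn: dropouts that were missed       tn: non-dropouts correctly passed
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not results from the paper).
example = binary_metrics(tp=40, fp=10, fn=20, tn=130)
```

Comparing two classifiers, such as a decision tree and a Bayesian model, would amount to computing these values from each model's confusion matrix on the same held-out data.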
