A data mining approach for preventing undergraduate student retention

The Brazilian government has invested funds via the REUNI program to increase the number of places in public universities. It has also sought to increase the number of places by approving regulations that legalize the release of places occupied by retained students so that they can be offered to new students. According to the retention criteria in these regulations, around 50% of places were occupied by retained students. This article presents a data mining approach for assessing the risk of undergraduate student retention at the end of the second semester of the course, with the aim of supporting the counseling of students to prevent their retention. The approach was developed for the Federal University of Pernambuco, focusing on six of its major courses. It involved data transformation from a database of over 400,000 subject records, the application of logistic regression for risk assessment, and the induction of rules for explaining that risk during the counseling process. The risk estimation solution achieved Max_KS = 0.51 and AUC_ROC = 0.84. Three scenarios, assuming 30%, 50% and 70% efficacy of the counseling process, show that the procedure is economically viable and should be implemented, with the decision threshold and return on investment varying according to that efficacy.
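
For readers unfamiliar with the two performance measures quoted above, the following is a minimal sketch of how a maximum Kolmogorov-Smirnov distance (Max_KS) and an area under the ROC curve (AUC_ROC) are commonly computed from risk scores and binary outcomes. It is not the authors' implementation. The function names `max_ks` and `auc_roc` and the toy scores are assumptions made for illustration, and the numbers are not data from the study.

```python
def max_ks(scores, labels):
    """Largest gap between the cumulative score distributions of the
    two classes. labels are 1 (retained) or 0 (not retained)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels))
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all records that share the same score before comparing,
        # so tied scores do not create an artificial gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def auc_roc(scores, labels):
    """Probability that a randomly chosen positive case scores higher
    than a randomly chosen negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (hypothetical scores, not study results).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(max_ks(toy_scores, toy_labels))   # 0.75
print(auc_roc(toy_scores, toy_labels))  # 0.9375
```

Both measures depend only on how the scores rank the students, not on the score values themselves, which is why they are reported separately from the choice of decision threshold.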