Data-driven system to predict academic grades and dropout

The role of the tutor is more important than ever in preventing student dropout and improving academic performance. This work proposes a data-driven system that extracts relevant information hidden in students' academic data and thus helps tutors offer their pupils more proactive, personal guidance. In particular, our system, based on machine learning techniques, predicts students' dropout intention and course grades and produces personalized course recommendations. We also present several visualizations that help in interpreting the results. In the experimental validation, we show that the system obtains promising results on data from the Law, Computer Science and Mathematics degree programs of the Universitat de Barcelona.
