Balancing Accuracy and Transparency in Early Alert Identification of Students at Risk

A key challenge in implementing early alert systems to identify students at risk of failure or withdrawal is balancing accuracy against transparency, since there are clear benefits to being able to communicate why a student has been identified. Past performance is an important predictor of future academic success, but it is generally not available for commencing students. Here, we present work in progress in which the full predictive power of an ensemble-based machine learning approach is used to make predictions for commencing students, while a simple logistic regression model, whose output is easier to explain, is used for ongoing students.
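The split between the two student groups can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`prior_gpa`, `failed_units`), the intercept, and the coefficients are hypothetical values chosen for illustration, and the ensemble model is represented only by a stand-in callable. The sketch shows why the logistic regression branch is the transparent one: each feature's contribution to the log-odds can be reported alongside the risk estimate.

```python
import math
from typing import Callable, Dict, Optional, Tuple

# Hypothetical intercept and coefficients, for illustration only
# (not taken from the paper).
INTERCEPT = 2.0
COEFFICIENTS = {"prior_gpa": -0.8, "failed_units": 0.6}


def logistic_risk(features: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return the risk probability and each feature's contribution to the log-odds."""
    contributions = {
        name: coef * features[name] for name, coef in COEFFICIENTS.items()
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


def predict_risk(
    student: dict,
    ensemble_model: Callable[[Dict[str, float]], float],
) -> Tuple[float, Optional[Dict[str, float]]]:
    """Route commencing students to the ensemble and ongoing students to logistic regression."""
    if student["commencing"]:
        # No past performance is available, so rely on the (opaque) ensemble.
        # No per-feature explanation is returned for this branch.
        return ensemble_model(student["features"]), None
    # Past performance is available, so use the explainable model.
    return logistic_risk(student["features"])


if __name__ == "__main__":
    ongoing = {
        "commencing": False,
        "features": {"prior_gpa": 2.5, "failed_units": 0.0},
    }
    # Stand-in for a trained ensemble; a real system would pass a fitted model here.
    risk, reasons = predict_risk(ongoing, ensemble_model=lambda f: 0.5)
    print(f"risk={risk:.2f}, contributions={reasons}")
```

In this sketch the contributions dictionary is what an advisor could relay to an ongoing student, whereas the ensemble branch returns only a score.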
