Stem-ming the Tide: Predicting STEM attrition using student transcript data

Science, technology, engineering, and math (STEM) fields play growing roles in national and international economies by driving innovation and generating high-salary jobs. Yet the US lags behind other highly industrialized nations in STEM education and training. Furthermore, many economic forecasts predict a rising shortage of domestic STEM-trained professionals in the US for years to come. One potential solution to this deficit is to decrease the rate at which students leave STEM-related fields in higher education, as currently over half of all students intending to graduate with a STEM degree eventually attrite. However, little quantitative research at scale has looked at the causes of STEM attrition, let alone used machine learning to examine how well this phenomenon can be predicted. In this paper, we detail our efforts to model and predict dropout from STEM fields using one of the largest known datasets used for research on students in a traditional campus setting. Our results suggest that attrition from STEM fields can be accurately predicted using only data that universities routinely collect on students' first academic year. We also propose a method to model student STEM intentions for each academic term to better understand the timing of STEM attrition events. We believe these results show great promise for using machine learning to improve STEM retention in traditional and non-traditional campus settings.
