Early Prediction of Student Success: Mining Students Enrolment Data

This paper explores the socio-demographic variables (age, gender, ethnicity, education, work status, and disability) and study environment variables (course programme and course block) that may influence persistence or dropout of students at the Open Polytechnic of New Zealand. We examine to what extent these factors, i.e. enrolment data, help us pre-identify successful and unsuccessful students. Data stored in the Open Polytechnic student management system from 2006 to 2009, covering over 450 students who enrolled in the 71150 Information Systems course, were used to perform a quantitative analysis of study outcome. Based on data mining techniques (such as feature selection and classification trees), the most important factors for student success and a profile of the typical successful and unsuccessful students are identified. The empirical results show the following: (i) the most important factors separating successful from unsuccessful students are ethnicity, course programme and course block; (ii) among classification tree growing methods, Classification and Regression Tree (CART) was the most successful in growing the tree, with an overall percentage of correct classification of 60.5%; and (iii) both the risk estimated by cross-validation and the gain diagram suggest that none of the trees, based only on enrolment data, separates successful from unsuccessful students well. The implications of these results for academic and administrative staff are discussed.
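To make the classification tree step more concrete, the following is a minimal sketch of how a CART-style tree might choose its first split on categorical enrolment variables. It assumes Gini impurity as the splitting criterion, which is a common choice for CART but is not stated in the abstract. The records, the field names (`course_block`, `work_status`, `outcome`) and their values are made up for illustration and are not data from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(rows, feature, value, target="outcome"):
    """Impurity decrease from the binary split `feature == value` vs. the rest."""
    parent = [r[target] for r in rows]
    left = [r[target] for r in rows if r[feature] == value]
    right = [r[target] for r in rows if r[feature] != value]
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


def best_split(rows, features, target="outcome"):
    """Return (gain, feature, value) for the candidate split with the largest gain."""
    candidates = [
        (gini_gain(rows, f, v, target), f, v)
        for f in features
        for v in sorted({r[f] for r in rows})
    ]
    return max(candidates)


# Hypothetical toy records, NOT taken from the Open Polytechnic data.
rows = [
    {"course_block": "S1", "work_status": "full-time", "outcome": "pass"},
    {"course_block": "S1", "work_status": "part-time", "outcome": "pass"},
    {"course_block": "S1", "work_status": "full-time", "outcome": "pass"},
    {"course_block": "S2", "work_status": "full-time", "outcome": "fail"},
    {"course_block": "S2", "work_status": "part-time", "outcome": "fail"},
    {"course_block": "S2", "work_status": "part-time", "outcome": "pass"},
]

gain, feature, value = best_split(rows, ["course_block", "work_status"])
print(f"best first split: {feature} == {value!r} (Gini gain {gain:.3f})")
```

In this toy sample the split on `course_block` reduces impurity while the split on `work_status` does not, so the tree would branch on course block first. A full CART procedure would repeat this search recursively in each child node and then prune the tree, typically using cross-validated risk.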
