Predicting Performance in an Introductory Programming Course by Logging and Analyzing Student Programming Behavior

The high failure rates of many programming courses mean there is a need to identify struggling students as early as possible. Prior research has focused on using tests of a student's demographic, psychological and cognitive traits as predictors of performance. These traits are static, however, and therefore fail to capture changes in a student's learning progress over the duration of a course. In this paper we present a new approach for predicting a student's performance in a programming course, based on analyzing directly logged data that describes various aspects of their ordinary programming behavior. An evaluation using data logged from a sample of 45 programming students at our university showed that our approach was an excellent early predictor of performance, explaining 42.49% of the variance in coursework marks, double the explanatory power of the closest related technique in the literature.
