37 Million Compilations: Investigating Novice Programming Mistakes in Large-Scale Student Data

Previous investigations of student errors have typically focused on samples of hundreds of students at individual institutions. This work uses a year's worth of compilation events from over 250,000 students around the world, taken from the large Blackbox data set. We analyze the frequency, time-to-fix, and spread of errors among users, showing how these factors interrelate and how they develop over the course of the year. These results can inform the design of courses, textbooks, and tools that target the most frequent (or hardest to fix) errors.
