Inferring Use Cases from Unit Testing

We present techniques for analyzing score matrices of unit test outcomes from snapshots of CS2 student code taken throughout the development cycle. This analysis includes a technique for estimating the number of fundamentally different features in the unit tests, as well as a survey of which algorithms best match human intuition when grouping tests into related clusters. Unlike previous investigations into topic clustering of score matrices, we successfully identify algorithms that achieve good accuracy on this task. We also discuss the data gathered by the Marmoset system, which has been used to collect over 100,000 snapshots of student programs and associated test results.
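To make the idea of a score matrix concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a binary matrix whose rows are code snapshots and whose columns are unit tests (1 = pass), and groups tests with a naive single-link agglomerative clustering over Hamming distance. The test names, the data, and the choice of distance are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

def hamming(a, b):
    """Fraction of snapshots on which two tests disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def single_link_clusters(columns, k):
    """Merge the two closest clusters until only k remain (naive single link)."""
    clusters = [[name] for name in columns]
    while len(clusters) > k:
        # Single-link distance: the closest pair of members across two clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(
                hamming(columns[a], columns[b])
                for a in clusters[p[0]]
                for b in clusters[p[1]]
            ),
        )
        clusters[i] += clusters.pop(j)  # i < j, so index i is unaffected
    return sorted(sorted(c) for c in clusters)

# Hypothetical score matrix: rows are snapshots, columns are unit tests.
test_names = ["testPush", "testPop", "testSort", "testSortEmpty"]
scores = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]

# Transpose so each test maps to its pass/fail history across snapshots.
columns = {name: [row[i] for row in scores] for i, name in enumerate(test_names)}
clusters = single_link_clusters(columns, k=2)
print(clusters)
```

In this toy data, tests that start passing at similar points in development end up in the same cluster. The number of clusters `k` is supplied by hand here; the abstract describes a separate technique for estimating it.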
