Predicting software build failure using source code metrics

In this paper, we describe the extraction of source code metrics from the Jazz repository and the application of data mining techniques to identify which of those metrics are most useful for predicting the success or failure of a build, that is, an attempt to construct a working instance of the software product. We present results from a study that combines the J48 classification method with a number of attribute selection strategies, applied to a set of source code metrics calculated from the code base at the beginning of a build cycle. The results indicate that only a relatively small number of the software metrics we considered have any significance for predicting the outcome of a build. We discuss these significant metrics and the implications of the results, particularly the relative difficulty of predicting failed build attempts. The results also indicate that there is some scope for predicting the outcome of a build by analysing the characteristics of the source code to be changed. This gives software project managers the opportunity to estimate the risk exposure of the changes planned for a build before coding begins.
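To make the idea of attribute selection concrete, the following minimal sketch ranks source code metrics by information gain with respect to the build outcome. It is an illustration under stated assumptions, not the study's pipeline: the metric names and values are hypothetical, and plain information gain over binary threshold splits stands in for the attribute selection strategies and the J48 classifier used in the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute, threshold):
    """Entropy reduction from splitting on rows[i][attribute] <= threshold."""
    left = [lab for row, lab in zip(rows, labels) if row[attribute] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[attribute] > threshold]
    if not left or not right:
        return 0.0
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


def best_gain(rows, labels, attribute):
    """Best gain over all midpoints between consecutive distinct values."""
    values = sorted({row[attribute] for row in rows})
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(
        (information_gain(rows, labels, attribute, t) for t in thresholds),
        default=0.0,
    )


def rank_attributes(rows, labels):
    """Return (attribute, gain) pairs sorted from most to least informative."""
    scores = {a: best_gain(rows, labels, a) for a in rows[0]}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: one row of metrics per build, measured at the start of
# the build cycle. Names and values are invented for illustration only.
rows = [
    {"cyclomatic_complexity": 12, "lines_of_code": 400, "number_of_methods": 30},
    {"cyclomatic_complexity": 15, "lines_of_code": 900, "number_of_methods": 31},
    {"cyclomatic_complexity": 14, "lines_of_code": 350, "number_of_methods": 55},
    {"cyclomatic_complexity": 18, "lines_of_code": 610, "number_of_methods": 33},
    {"cyclomatic_complexity": 35, "lines_of_code": 420, "number_of_methods": 29},
    {"cyclomatic_complexity": 41, "lines_of_code": 880, "number_of_methods": 52},
]
labels = ["success", "success", "success", "success", "failure", "failure"]

ranking = rank_attributes(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data only one metric separates the failed builds cleanly, so it ranks first and the others score lower, which mirrors the finding that a small subset of metrics carries most of the predictive signal. The toy labels are also imbalanced toward successful builds, the kind of setting in which failed builds tend to be harder to predict.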
