Assessing Software Quality by Program Clustering and Defect Prediction

Many empirical studies have shown that defect prediction models built on product metrics can be used to assess the quality of software modules. So far, most methods proposed in this direction predict defects at the class or file level. In this paper, we propose a novel software defect prediction method based on functional clusters of programs, with the aim of improving the performance of defect prediction, especially its effort-aware performance. The method uses appropriately grained, problem-oriented program clusters as the basic units of defect prediction. To evaluate its effectiveness, we conducted an experimental study on Eclipse 3.0. We found that, compared with class-based models, cluster-based prediction models significantly improve the recall (from 31.6% to 99.2%) and precision (from 73.8% to 91.6%) of defect prediction. According to the effort-aware evaluation, the effort needed to review code to find half of the total defects is reduced by 6% when cluster-based prediction models are used.
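The effort-aware figure quoted in the abstract can be read as the share of code that must be reviewed, in the order suggested by a model, before half of the known defects are covered. The sketch below illustrates one common way to compute such a number. It is not the procedure used in the paper: the function name `effort_to_find_fraction`, the ranking by predicted risk per line of code, and the sample data are all assumptions made for illustration.

```python
from typing import List, Tuple

# Each unit (a class or a cluster) is (predicted_risk, lines_of_code, actual_defects).
# This tuple layout is a hypothetical one chosen for the example.
Unit = Tuple[float, int, int]


def effort_to_find_fraction(units: List[Unit], target: float = 0.5) -> float:
    """Return the fraction of total LOC reviewed before `target` of all defects are found.

    Units are reviewed in descending order of predicted risk per line of code,
    which is an assumed ranking, not necessarily the one used in the study.
    """
    if any(loc <= 0 for _, loc, _ in units):
        raise ValueError("every unit needs a positive line count")
    total_loc = sum(loc for _, loc, _ in units)
    total_defects = sum(defects for _, _, defects in units)
    if total_defects == 0:
        raise ValueError("no defects to find")

    ranked = sorted(units, key=lambda u: u[0] / u[1], reverse=True)
    reviewed_loc = 0
    found = 0
    for _, loc, defects in ranked:
        reviewed_loc += loc
        found += defects
        if found >= target * total_defects:
            break
    return reviewed_loc / total_loc


# Made-up data: four units, 1000 LOC and 6 defects in total.
sample = [(0.9, 100, 3), (0.2, 400, 1), (0.6, 50, 2), (0.1, 450, 0)]
effort = effort_to_find_fraction(sample)
print(f"LOC share reviewed to find half of the defects: {effort:.2%}")
```

Running the same function on class-level and on cluster-level predictions for one system gives two effort values that can be compared directly, which is the kind of comparison the 6% reduction refers to.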
