Thresholds for Size and Complexity Metrics: A Case Study from the Perspective of Defect Density

Practical guidelines on what distinguishes higher-quality code are in great demand. For example, it is reasonable to expect the most complex code to be buggy. Structuring code into reasonably sized files and classes also appears to be prudent. Many attempts to determine (or declare) risk thresholds for various code metrics have been made. In this paper we examine the applicability of such thresholds. Hence, we replicate a recently published technique for calculating metric thresholds to determine high-risk files based on code size (LOC and number of methods) and complexity (cyclomatic complexity and module interface coupling), using a very large set of open and closed source projects written primarily in Java. We relate the threshold-derived risk to (a) the probability that a file has a defect, and (b) the defect density of the files in the high-risk group. We find that the probability of a file having a defect is, with a few exceptions, higher in the very high-risk group. This is particularly pronounced when using size thresholds. Surprisingly, the defect density was uniformly lower in the very high-risk group of files. Our results suggest that, as expected, less code is associated with fewer defects. However, the same amount of code was associated with fewer defects when located in large and complex files than when located in smaller and less complex files. Hence we conclude that risk thresholds for size and complexity metrics have to be used with caution, if at all. Our findings have an immediate practical implication: redistributing Java code into smaller and less complex files may be counterproductive.
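
To make the two outcome measures concrete, the sketch below shows how files might be binned into risk groups by a size metric and how (a) the probability that a file in a group has a defect and (b) the group's defect density could then be computed. It is a minimal illustration, not the paper's actual pipeline: the threshold values, the FileRecord structure, and the sample data are assumptions made only for the example, loosely in the spirit of benchmark-derived thresholds such as Alves et al. (2010).

```python
# Minimal sketch (illustrative assumptions, not the study's implementation):
# bin files into risk groups by LOC thresholds, then compare defect
# probability and defect density per group.

from dataclasses import dataclass
from typing import List

@dataclass
class FileRecord:
    loc: int      # lines of code in the file
    defects: int  # defects attributed to the file (e.g., via fix-inducing changes)

# Hypothetical benchmark-derived LOC thresholds delimiting the four risk groups.
LOC_THRESHOLDS = [100, 300, 600]
LABELS = ["low", "moderate", "high", "very high"]

def risk_group(loc: int) -> str:
    """Return the risk label for a file of the given size."""
    for threshold, label in zip(LOC_THRESHOLDS, LABELS):
        if loc <= threshold:
            return label
    return LABELS[-1]

def summarize(files: List[FileRecord]) -> None:
    """Print defect probability and defect density (defects/KLOC) per risk group."""
    groups = {}
    for f in files:
        groups.setdefault(risk_group(f.loc), []).append(f)
    for label in LABELS:
        members = groups.get(label, [])
        if not members:
            continue
        n = len(members)
        defective = sum(1 for f in members if f.defects > 0)
        total_defects = sum(f.defects for f in members)
        total_loc = sum(f.loc for f in members)
        p_defect = defective / n                      # (a) probability of a defect
        density = 1000.0 * total_defects / total_loc  # (b) defects per 1000 LOC
        print(f"{label:10s} files={n:4d}  P(defect)={p_defect:.2f}  defects/KLOC={density:.2f}")

if __name__ == "__main__":
    import random
    random.seed(0)
    # Synthetic sample data purely for demonstration.
    sample = [FileRecord(loc=random.randint(20, 2000),
                         defects=random.randint(0, 3)) for _ in range(200)]
    summarize(sample)
```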
