Different strokes for different folks: a case study on software metrics for different defect categories

Defect prediction has evolved with a variety of metric sets and defect types. Researchers have found code, churn, and network metrics to be significant indicators of defects. However, not every metric set is informative for every defect category; a single metric type may account for the majority of a defect category. Our previous study showed that prediction models sensitive to defect category are more successful than general models, since each category has different characteristics in terms of metrics. We extend our previous work and propose specialized prediction models using churn, code, and network metrics with respect to three defect categories. Results show that churn metrics are the best for predicting all defects. The strength of correlation for code and network metrics varies with defect category: network metrics have higher correlations than code metrics for defects reported during functional testing and in the field, and vice versa for defects reported during system testing.
