Which process metrics can significantly improve defect prediction models? An empirical study

Knowledge of the software metrics that serve as defect indicators is vital for the efficient allocation of quality assurance resources. Process metrics, although sometimes difficult to collect, have recently become popular in defect prediction. However, to correctly identify the process metrics that are actually worth collecting, we need evidence validating their ability to improve product metric-based defect prediction models. This paper presents an empirical evaluation in which several process metrics were investigated in order to identify those which significantly improve defect prediction models based on product metrics. Data from a wide range of software projects (both industrial and open source) were collected. The predictions of models that use only product metrics (simple models) were compared with the predictions of models that use product metrics as well as one of the process metrics under scrutiny (advanced models). To decide whether the improvements were significant, statistical tests were performed and effect sizes were calculated. The advanced defect prediction models trained on a data set containing product metrics together with the Number of Distinct Committers (NDC) were significantly better than the simple models without NDC; the effect size was medium and the probability of superiority (PS) of the advanced models over the simple ones was high (p = .016, r = .29, PS = .76), a substantial finding useful in defect prediction. A similar result with a slightly smaller PS was achieved by the advanced models trained on a data set containing product metrics together with all of the investigated process metrics (p = .038, r = -.29, PS = .68). The advanced models trained on a data set containing product metrics together with the Number of Modified Lines (NML) were also significantly better than the simple models without NML, but the effect size was small (p = .038, r = .06). Hence, it is reasonable to recommend the NDC process metric for building defect prediction models.
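
The abstract does not spell out the exact statistical machinery, but paired comparisons of this kind are commonly carried out with the Wilcoxon signed-rank test, the effect size approximated as r = Z / sqrt(N), and PS estimated as the proportion of paired cases in which the advanced model outperforms the simple one. The following is a minimal sketch under those assumptions; compare_models and all scores below are illustrative placeholders, not the authors' actual analysis code.

# Sketch of a paired comparison of "simple" (product-metrics-only) vs
# "advanced" (product + process metric) defect prediction models.
# Assumes one performance score per project for each model variant; the
# Wilcoxon signed-rank test, r = Z / sqrt(N), and the PS estimate below
# are common choices, not confirmed details of the paper.
import numpy as np
from scipy import stats

def compare_models(simple_scores, advanced_scores):
    simple = np.asarray(simple_scores, dtype=float)
    advanced = np.asarray(advanced_scores, dtype=float)
    n = len(simple)

    # Paired non-parametric test of whether the advanced models differ
    # from the simple ones across projects.
    w_stat, p_value = stats.wilcoxon(advanced, simple)

    # Effect size r = Z / sqrt(N), recovering Z from the two-sided p-value.
    z = stats.norm.isf(p_value / 2.0)
    r = z / np.sqrt(n)

    # Probability of superiority: fraction of projects where the advanced
    # model scores better than the simple one (ties counted as half a win).
    wins = np.sum(advanced > simple) + 0.5 * np.sum(advanced == simple)
    ps = wins / n

    return p_value, r, ps

# Hypothetical per-project scores (e.g., prediction accuracy), for illustration only.
simple = [0.61, 0.58, 0.70, 0.64, 0.55, 0.67, 0.59, 0.62]
advanced = [0.66, 0.63, 0.71, 0.69, 0.60, 0.66, 0.65, 0.68]
p, r, ps = compare_models(simple, advanced)
print(f"p = {p:.3f}, r = {r:.2f}, PS = {ps:.2f}")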
