A Comparative Study to Benchmark Cross-Project Defect Prediction Approaches

Cross-Project Defect Prediction (CPDP) as a means to focus the quality assurance of software projects has been under heavy investigation in recent years. However, in the current state of the art it is unclear which of the many proposals performs best, because results are rarely replicated and experiment setups differ in their performance metrics and underlying data. In this article, we provide a benchmark for CPDP. We replicate 24 approaches proposed by researchers between 2008 and 2015 and evaluate their performance on software products from five different data sets. Based on our benchmark, we determined that an approach proposed by Camargo Cruz and Ochimizu (2009), which is based on data standardization, performs best and is always ranked among the statistically significantly best results for all metrics and data sets. Approaches proposed by Turhan et al. (2009), Menzies et al. (2011), and Watanabe et al. (2008) are also nearly always among the best results. Moreover, we determined that predictions only seldom achieve a high performance of 0.75 recall, precision, and accuracy. Thus, CPDP has not yet reached a point where its performance is sufficient for application in practice.
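
To make the performance criterion concrete, the following minimal sketch computes recall, precision, and accuracy from the counts of a binary confusion matrix and checks them against the 0.75 mark. The function names, the example counts, and the reading of the criterion as "at least 0.75 on all three metrics" are assumptions made for illustration; they are not taken from the benchmark itself.

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Recall, precision, and accuracy from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives, where "positive"
    means "predicted defective". Undefined ratios are reported as 0.0.
    """
    total = tp + fp + tn + fn
    return {
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


def is_high_performance(tp: int, fp: int, tn: int, fn: int,
                        threshold: float = 0.75) -> bool:
    """Assumed reading of the criterion: all three metrics >= threshold."""
    return all(value >= threshold for value in metrics(tp, fp, tn, fn).values())


if __name__ == "__main__":
    # Hypothetical prediction results for two software products.
    for counts in [(80, 20, 90, 10), (20, 30, 140, 10)]:
        print(counts, metrics(*counts), is_high_performance(*counts))
```

The second hypothetical product shows why all three metrics are checked together: its accuracy is 0.8, but recall and precision fall well short of 0.75, which is typical when defective modules are the minority class.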

[1] Andreas Zeller, et al. Mining metrics to predict component failures, 2006, ICSE.

[2] Premkumar T. Devanbu, et al. Recalling the "imprecision" of cross-project defect prediction, 2012, SIGSOFT FSE.

[3] Tracy Hall, et al. A Systematic Literature Review on Fault Prediction Performance in Software Engineering, 2012, IEEE Transactions on Software Engineering.

[4] Bruce Christianson, et al. The misuse of the NASA metrics data program data sets for automated software defect prediction, 2011, EASE.

[5] Rongxin Wu, et al. ReLink: recovering links between bugs and changes, 2011, ESEC/FSE '11.

[6] Ayse Basar Bener, et al. On the relative value of cross-company and within-company data for defect prediction, 2009, Empirical Software Engineering.

[7] Alberto Maria Segre, et al. Programs for Machine Learning, 1994.

[8] Lionel C. Briand, et al. Assessing the Applicability of Fault-Proneness Models Across Object-Oriented Software Projects, 2002, IEEE Trans. Software Eng.

[9] Jens Grabowski, et al. [Journal First] A Comparative Study to Benchmark Cross-Project Defect Prediction Approaches, 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).

[10] Ayse Basar Bener, et al. Empirical evaluation of the effects of mixed project data on learning defect predictors, 2013, Inf. Softw. Technol.

[11] Ayse Basar Bener, et al. Empirical Evaluation of Mixed-Project Defect Prediction Models, 2011, 2011 37th EUROMICRO Conference on Software Engineering and Advanced Applications.

[12] Sashank Dara, et al. Online Defect Prediction for Imbalanced Data, 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[13] Thomas R. Ioerger, et al. Enhancing Learning using Feature and Example selection, 2003.

[14] Maurice H. Halstead, et al. Elements of software science (Operating and programming systems series), 1977.

[15] Tim Menzies, et al. Balancing Privacy and Utility in Cross-Company Defect Prediction, 2013, IEEE Transactions on Software Engineering.

[16] Forrest Shull, et al. Local versus Global Lessons for Defect Prediction and Effort Estimation, 2013, IEEE Transactions on Software Engineering.

[17] Victor R. Basili, et al. A Validation of Object-Oriented Design Metrics as Quality Indicators, 1996, IEEE Trans. Software Eng.

[18] Tim Menzies, et al. Better cross company defect prediction, 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).

[19] Tim Menzies, et al. Learning from Open-Source Projects: An Empirical Study on Defect Prediction, 2013, 2013 ACM / IEEE International Symposium on Empirical Software Engineering and Measurement.

[20] M. Friedman. A Comparison of Alternative Tests of Significance for the Problem of $m$ Rankings, 1940.

[21] O. J. Dunn. Multiple Comparisons among Means, 1961.

[22] Osamu Mizuno, et al. A Cross-Project Evaluation of Text-Based Fault-Prone Module Prediction, 2014, 2014 6th International Workshop on Empirical Software Engineering in Practice.

[23] Taghi M. Khoshgoftaar, et al. Evolutionary Optimization of Software Quality Modeling with Multiple Repositories, 2010, IEEE Transactions on Software Engineering.

[24] Tim Menzies, et al. Privacy and utility for defect prediction: Experiments with MORPH, 2012, 2012 34th International Conference on Software Engineering (ICSE).

[25] A. Zeller, et al. Predicting Defects for Eclipse, 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[26] Gerardo Canfora, et al. Defect prediction as a multiobjective optimization problem, 2015, Softw. Test. Verification Reliab.

[27] Tim Menzies, et al. Heterogeneous Defect Prediction, 2015, IEEE Transactions on Software Engineering.

[28] Jaechang Nam, et al. CLAMI: Defect Prediction on Unlabeled Datasets, 2015, ASE 2015.

[29] Andreas Zeller, et al. Predicting defects using change genealogies, 2013, 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE).

[30] Leo Breiman, et al. Random Forests, 2001, Machine Learning.

[31] Bojan Cukic, et al. Predicting more from less: Synergies of learning, 2013, 2013 2nd International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE).

[32] Gerardo Canfora, et al. Multi-objective Cross-Project Defect Prediction, 2013, 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation.

[33] D. Broomhead, et al. Radial Basis Functions, Multi-Variable Functional Interpolation and Adaptive Networks, 1988.

[34] Steffen Herbold, et al. CrossPare: A Tool for Benchmarking Cross-Project Defect Predictions, 2015, 2015 30th IEEE/ACM International Conference on Automated Software Engineering Workshop (ASEW).

[35] Johan A. K. Suykens, et al. Benchmarking Least Squares Support Vector Machine Classifiers, 2004, Machine Learning.

[36] Andrea De Lucia, et al. Cross-project defect prediction models: L'Union fait la force, 2014, 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE).

[37] Koichiro Ochimizu, et al. Towards logistic regression models for predicting fault-prone code across software projects, 2009, 2009 3rd International Symposium on Empirical Software Engineering and Measurement.

[38] Lucas Layman, et al. LACE2: Better Privacy-Preserving Data Sharing for Cross Project Defect Prediction, 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[39] Shane McIntosh, et al. Automated Parameter Optimization of Classification Techniques for Defect Prediction Models, 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[40] Ye Yang, et al. An investigation on the feasibility of cross-project defect prediction, 2012, Automated Software Engineering.

[41] Qing Sun, et al. Software defect prediction via transfer learning based neural network, 2015, 2015 First International Conference on Reliability Systems Engineering (ICRSE).

[42] H. B. Mann, et al. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, 1947.

[43] Shane McIntosh, et al. Revisiting the Impact of Classification Techniques on the Performance of Defect Prediction Models, 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[44] Akito Monden, et al. An Ensemble Approach of Simple Regression Models to Cross-Project Fault Prediction, 2012, 2012 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing.

[45] Audris Mockus, et al. Towards building a universal defect prediction model, 2014, MSR 2014.

[46] Steffen Herbold, et al. A systematic mapping study on cross-project defect prediction, 2017, ArXiv.

[47] Xiao Liu, et al. An empirical study on software defect prediction with a simplified metric set, 2014, Inf. Softw. Technol.

[48] Steffen Herbold, et al. Training data selection for cross-project defect prediction, 2013, PROMISE.

[49] Qinbao Song, et al. Data Quality: Some Comments on the NASA Software Defect Datasets, 2013, IEEE Transactions on Software Engineering.

[50] Burak Turhan, et al. Implications of ceiling effects in defect predictors, 2008, PROMISE '08.

[51] Michele Lanza, et al. An extensive comparison of bug prediction approaches, 2010, 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010).

[52] Lech Madeyski, et al. Towards identifying software project clusters with regard to defect prediction, 2010, PROMISE '10.

[53] Jongmoon Baik, et al. A Hybrid Instance Selection Using Nearest-Neighbor for Cross-Project Defect Prediction, 2015, Journal of Computer Science and Technology.

[54] Lech Madeyski, et al. Cross-Project Defect Prediction With Respect To Code Ownership Model: An Empirical Study, 2015, e Informatica Softw. Eng. J.

[55] Janez Demsar, et al. Statistical Comparisons of Classifiers over Multiple Data Sets, 2006, J. Mach. Learn. Res.

[56] Leo Breiman, et al. Bagging Predictors, 1996, Machine Learning.

[57] Franz Wotawa, et al. Novel Insights on Cross Project Fault Prediction Applied to Automotive Software, 2015, ICTSS.

[58] Sousuke Amasaki, et al. Improving Cross-Project Defect Prediction Methods with Data Simplification, 2015, 2015 41st Euromicro Conference on Software Engineering and Advanced Applications.

[59] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.

[60] D. Cox. The Regression Analysis of Binary Sequences, 1958.

[61] R. A. Fisher. The Correlation between Relatives on the Supposition of Mendelian Inheritance, 1918.

[62] Bart Baesens, et al. Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings, 2008, IEEE Transactions on Software Engineering.

[63] Zhaowei Shang, et al. Negative samples reduction in cross-company software defects prediction, 2015, Inf. Softw. Technol.

[64] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SKDD.

[65] Tim Menzies, et al. Local vs. global models for effort estimation and defect prediction, 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[66] Jongmoon Baik, et al. Value-cognitive boosting with a support vector machine for cross-project defect prediction, 2014, Empirical Software Engineering.

[67] Naoyasu Ubayashi, et al. Studying just-in-time defect prediction using cross-project models, 2015, Empirical Software Engineering.

[68] Fabian Trautsch, et al. Adressing Problems with External Validity of Repository Mining Studies Through a Smart Data Platform, 2016, 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR).

[69] Jongmoon Baik, et al. A transfer cost-sensitive boosting approach for cross-project defect prediction, 2017, Software Quality Journal.

[70] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[71] David S. Broomhead, et al. Multivariable Functional Interpolation and Adaptive Networks, 1988, Complex Syst.

[72] Guangchun Luo, et al. Transfer learning for cross-company software defect prediction, 2012, Inf. Softw. Technol.

[73] David Lo, et al. An Empirical Study of Classifier Combination for Cross-Project Defect Prediction, 2015, 2015 IEEE 39th Annual Computer Software and Applications Conference.

[74] Sinno Jialin Pan, et al. Transfer defect learning, 2013, 2013 35th International Conference on Software Engineering (ICSE).

[75] Burak Turhan, et al. On the dataset shift problem in software engineering prediction models, 2011, Empirical Software Engineering.

[76] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.

[77] Taghi M. Khoshgoftaar, et al. Software quality analysis by combining multiple projects and learners, 2008, Software Quality Journal.

[78] Haruhiko Kaiya, et al. Adapting a fault prediction model to allow inter language reuse, 2008, PROMISE '08.

[79] T. Pohlert. The Pairwise Multiple Comparison of Mean Ranks Package (PMCMR), 2016.

[80] Naoyasu Ubayashi, et al. An empirical study of just-in-time defect prediction using cross-project models, 2014, MSR 2014.

[81] Sousuke Amasaki, et al. Improving Relevancy Filter Methods for Cross-Project Defect Prediction, 2015, 2015 3rd International Conference on Applied Computing and Information Technology/2nd International Conference on Computational Science and Intelligence.

[82] Brian Henderson-Sellers, et al. Object-Oriented Metrics, 1995, TOOLS.

[83] Michele Lanza, et al. Evaluating defect prediction approaches: a benchmark and an extensive comparison, 2011, Empirical Software Engineering.

[84] Jens Grabowski, et al. Global vs. local models for cross-project defect prediction, 2017, Empirical Software Engineering.

[85] Rich Caruana, et al. An empirical comparison of supervised learning algorithms, 2006, ICML.

[86] Harald C. Gall, et al. Cross-project defect prediction: a large scale experiment on data vs. domain vs. process, 2009, ESEC/SIGSOFT FSE.