Software metrics thresholds calculation techniques to predict fault-proneness: An empirical comparison

Abstract

Context: Fault-proneness prediction is an important field of software engineering. Developers and testers can use it to prioritize tests, allowing a better allocation of resources, reducing testing time and costs, and improving the effectiveness of software testing. Unsupervised fault-proneness prediction models, especially thresholds-based models, are easy to automate and give developers and testers valuable insight into the resulting classification.

Objective: In this paper, we investigated three thresholds calculation techniques that can be used for fault-proneness prediction: ROC Curves, VARL (Value of an Acceptable Risk Level), and Alves Rankings. We compared the performance of these techniques with that of four machine learning models and two clustering-based models.

Method: Threshold values were calculated on twelve public datasets: eleven from the PROMISE Repository and one based on the Eclipse project. Thresholds-based models were then constructed using each of the thresholds calculation techniques investigated. For comparison, results were also computed for supervised machine learning and clustering-based models. Inter-dataset experiments were performed both across different systems and across versions of the same system.

Results: Results show that ROC Curves is the best performing of the three thresholds calculation methods investigated, closely followed by Alves Rankings. The VARL method did not give useful results for most of the datasets investigated and was clearly outperformed by the other two methods. Results also show that thresholds-based models using ROC Curves outperformed the machine learning and clustering-based models.

Conclusion: Of the three thresholds calculation techniques for fault-proneness prediction, ROC Curves is the best choice, but Alves Rankings is a good alternative. Indeed, the advantage of Alves Rankings over the ROC Curves technique is that it is completely unsupervised and can therefore produce pertinent threshold values even when fault data is not available.
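
To make the three compared techniques concrete, the sketch below shows, in minimal form, how each one derives a metric cutoff from per-module data. It is an illustration under stated assumptions, not the authors' implementation: the function names and example data are hypothetical, the ROC cutoff is chosen here by Youden's J statistic, VARL follows Bender's closed-form solution of a univariate logistic fit (done here with scikit-learn), and Alves Rankings is simplified to a single size-weighted quantile over one benchmark system.

```python
# Minimal sketches of the three threshold-derivation techniques compared in
# the paper. Assumes a dataset of modules with one metric value, a binary
# fault label, and a size measure (LOC). Illustrative only.

import numpy as np


def roc_threshold(metric, faulty):
    """Supervised: pick the cutoff maximizing Youden's J
    (sensitivity + specificity - 1) over all candidate metric values."""
    positives = faulty.sum()
    negatives = len(faulty) - positives
    best_t, best_j = None, -1.0
    for t in np.unique(metric):
        predicted = metric >= t  # modules at or above t flagged fault-prone
        tp = np.sum(predicted & (faulty == 1))
        tn = np.sum(~predicted & (faulty == 0))
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t


def varl_threshold(metric, faulty, p0=0.05):
    """Supervised: VARL (Bender's Value of an Acceptable Risk Level).
    Fit P(fault|x) = 1 / (1 + exp(-(a + b*x))) and solve for the metric
    value where the risk equals p0:
        VARL = (ln(p0 / (1 - p0)) - a) / b
    The scikit-learn fit is an assumption; any logistic fit works."""
    from sklearn.linear_model import LogisticRegression
    model = LogisticRegression().fit(metric.reshape(-1, 1), faulty)
    a, b = model.intercept_[0], model.coef_[0][0]
    return (np.log(p0 / (1 - p0)) - a) / b


def alves_threshold(metric, loc, quantile=0.90):
    """Unsupervised: Alves Rankings, simplified to one system. Weight each
    entity by its size (LOC), sort by the metric, and return the metric
    value at the chosen weighted quantile of the distribution."""
    order = np.argsort(metric)
    cumulative = np.cumsum(loc[order] / loc.sum())  # size-weighted CDF
    idx = np.searchsorted(cumulative, quantile)
    return metric[order][min(idx, len(metric) - 1)]


if __name__ == "__main__":
    # Synthetic demo data: a complexity-like metric, module sizes, and
    # fault labels drawn from a logistic risk curve.
    rng = np.random.default_rng(0)
    wmc = rng.integers(1, 60, size=500).astype(float)
    loc = rng.integers(10, 2000, size=500).astype(float)
    faults = (rng.random(500) < 1 / (1 + np.exp(-(wmc - 30) / 6))).astype(int)

    print("ROC threshold :", roc_threshold(wmc, faults))
    print("VARL (p0=0.05):", varl_threshold(wmc, faults, 0.05))
    print("Alves 90%     :", alves_threshold(wmc, loc, 0.90))
```

Note the structural difference the abstract's conclusion turns on: the first two functions need the fault labels, while the Alves quantile needs only the metric and size distributions, which is why it remains applicable when fault data is unavailable.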
