Multiple-classifiers in software quality engineering: Combining predictors to improve software fault prediction ability

Abstract Software development projects require a critical and costly testing phase to investigate the efficiency of the resulting product. As the size and complexity of a project increase, manually identifying software defects becomes time-consuming and costly. An alternative is to use automated predictors that flag fault-prone modules, letting the software engineer examine the defective parts in more detail. Improved fault predictors are therefore readily applicable in software quality projects. Many base predictors have been designed and tested for this purpose. However, base predictors can be combined with an ensemble strategy to further improve their performance, particularly their fault-detection ability. The aim of this study is to empirically compare the fault-prediction performance of ten ensemble predictors with that of baseline predictors. In our experiments, we used 15 software projects from the PROMISE repository and evaluated the fault-detection performance of the algorithms in terms of F-measure (FM) and Area Under the Receiver Operating Characteristic (ROC) Curve (AUC). The experimental results showed that ensemble predictors might improve fault-detection performance to some extent.
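The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation: the score-averaging combiner, the 0.5 cutoff, the function names, and the toy data are all assumptions made for illustration, and the abstract does not specify which ten ensemble strategies were used. F-measure is computed as the harmonic mean of precision and recall on the faulty class, and AUC as the probability that a randomly chosen faulty module receives a higher score than a randomly chosen non-faulty one.

```python
from typing import List, Sequence


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the faulty class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (faulty, non-faulty) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one faulty and one non-faulty module")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_scores(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical combiner: average the fault scores of several base predictors."""
    return [sum(col) / len(col) for col in zip(*score_lists)]


def threshold(scores: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn fault scores into labels; the 0.5 cutoff is an assumption."""
    return [1 if s >= cutoff else 0 for s in scores]


if __name__ == "__main__":
    # Toy data, invented for illustration: 1 = faulty module, 0 = non-faulty.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    base_scores = [
        [0.9, 0.4, 0.7, 0.6, 0.2, 0.1, 0.3, 0.8],
        [0.6, 0.7, 0.3, 0.2, 0.4, 0.1, 0.6, 0.9],
        [0.8, 0.6, 0.6, 0.3, 0.1, 0.4, 0.2, 0.4],
    ]
    for i, scores in enumerate(base_scores, start=1):
        print(f"base {i}: FM={f_measure(y_true, threshold(scores)):.3f} "
              f"AUC={auc(y_true, scores):.3f}")
    combined = average_scores(base_scores)
    print(f"ensemble: FM={f_measure(y_true, threshold(combined)):.3f} "
          f"AUC={auc(y_true, combined):.3f}")
```

Running the script prints both measures for each hypothetical base predictor and for their averaged combination, which mirrors the kind of base-versus-ensemble comparison the abstract describes on a much smaller scale.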
