On the applicability of evolutionary computation for software defect prediction

Removing defects is key to ensuring the long-term, error-free operation of a software system. Although improvements in the software testing process have resulted in better coverage, it is evident that some parts of a software system tend to be more defect-prone than others, and identifying these parts can greatly help software practitioners deliver high-quality, maintainable software products. A defect prediction model is built by training a learner on software metrics; such models can then be used to predict defective classes in a software system. Many studies have been conducted on predicting defective classes in the early phases of software development. However, evolutionary computation techniques have not yet been explored for predicting defective classes, even though their nature makes them well suited to software engineering problems. In this study we explore the predictive ability of evolutionary computation and hybridized evolutionary computation techniques for defect prediction. This work contributes to the literature by applying 15 evolutionary computation and hybridized evolutionary computation techniques to 5 datasets obtained from the Apache Software Foundation using the Defect Collection and Reporting System. The results are evaluated in terms of accuracy, and the techniques are further compared using Friedman ranking. The results suggest that defect prediction models built using evolutionary computation techniques perform well across all the datasets in terms of prediction accuracy.
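To make the evaluation pipeline the abstract describes concrete, below is a minimal Python sketch, not the authors' implementation: train a learner on software-metric features, score prediction accuracy with cross-validation, and compare techniques across datasets with the Friedman test. Everything specific here is an assumption for illustration: the synthetic metric data, the DecisionTreeClassifier stand-in learner, and the placeholder accuracy values; the study itself compares 15 techniques over 5 Apache datasets, while the sketch uses 3 placeholder techniques for brevity.

```python
# Sketch of the defect-prediction evaluation pipeline (illustrative only).
import numpy as np
from scipy.stats import friedmanchisquare
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier  # stand-in learner, not from the paper

rng = np.random.default_rng(0)

# Hypothetical stand-in for one dataset: rows are classes, columns are
# object-oriented metrics (e.g. WMC, CBO, LCOM), and the label marks
# whether the class was reported defective.
X = rng.normal(size=(200, 6))        # 200 classes, 6 metrics (synthetic)
y = rng.integers(0, 2, size=200)     # 1 = defective, 0 = clean (synthetic)

# Accuracy of one technique on one dataset via 10-fold cross-validation.
acc = cross_val_score(DecisionTreeClassifier(), X, y, cv=10,
                      scoring="accuracy").mean()
print(f"mean accuracy: {acc:.3f}")

# Friedman test across techniques: each list holds one technique's
# accuracy on each of the 5 datasets (placeholder values).
acc_technique_a = [0.81, 0.78, 0.84, 0.79, 0.82]
acc_technique_b = [0.77, 0.74, 0.80, 0.75, 0.78]
acc_technique_c = [0.83, 0.80, 0.85, 0.81, 0.84]
stat, p = friedmanchisquare(acc_technique_a, acc_technique_b, acc_technique_c)
print(f"Friedman chi-square = {stat:.2f}, p = {p:.3f}")
```

`friedmanchisquare` returns the chi-square statistic and p-value; a significant result is what justifies ranking the techniques against one another, as the study does with Friedman ranking.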
