Entropy based bug prediction using support vector regression

Predicting software defects is one of the key areas of research in software engineering. Researchers have devised and implemented a plethora of defect/bug prediction approaches namely code churn, past bugs, refactoring, number of authors, file size and age, etc by measuring the performance in terms of accuracy and complexity. Different mathematical models have also been developed in the literature to monitor the bug occurrence and fixing process. These existing mathematical models named software reliability growth models are either calendar time or testing effort dependent. The occurrence of bugs in the software is mainly due to the continuous changes in the software code. The continuous changes in the software code make the code complex. The complexity of the code changes have already been quantified in terms of entropy as follows in Hassan [9]. In the available literature, few authors have proposed entropy based bug prediction using conventional simple linear regression (SLR) method. In this paper, we have proposed an entropy based bug prediction approach using support vector regression (SVR). We have compared the results of proposed models with the existing one in the literature and have found that the proposed models are good bug predictor as they have shown the significant improvement in their performance.

[1]  Jesús M. González-Barahona,et al.  Towards a Theoretical Model for Software Growth , 2007, Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007).

[2]  Taghi M. Khoshgoftaar,et al.  Data Mining for Predictors of Software Quality , 1999, Int. J. Softw. Eng. Knowl. Eng..

[3]  Dewayne E. Perry,et al.  Classification and evaluation of defects in a project retrospective , 2002, J. Syst. Softw..

[4]  Michele Lanza,et al.  An extensive comparison of bug prediction approaches , 2010, 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010).

[5]  Michael R. Lyu,et al.  Handbook of software reliability engineering , 1996 .

[6]  Amrit L. Goel,et al.  Time-Dependent Error-Detection Rate Model for Software Reliability and Other Performance Measures , 1979, IEEE Transactions on Reliability.

[7]  Lionel C. Briand,et al.  Predicting fault-prone components in a java legacy system , 2006, ISESE '06.

[8]  Michele Lanza,et al.  Evaluating defect prediction approaches: a benchmark and an extensive comparison , 2011, Empirical Software Engineering.

[9]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[10]  Sang Joon Kim,et al.  A Mathematical Theory of Communication , 2006 .

[11]  Andreas Zeller,et al.  Mining metrics to predict component failures , 2006, ICSE.

[12]  N. Nagappan,et al.  Use of relative code churn measures to predict system defect density , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[13]  Richard C. Holt,et al.  The top ten list: dynamic fault prediction , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[14]  Elaine J. Weyuker,et al.  Predicting the location and number of faults in large software systems , 2005, IEEE Transactions on Software Engineering.

[15]  John D. Musa,et al.  Software reliability - measurement, prediction, application , 1987, McGraw-Hill series in software engineering and technology.

[16]  Ping Guo,et al.  Support Vector Regression for Software Reliability Growth Modeling and Prediction , 2005, ISNN.

[17]  N. Nagappan,et al.  Static analysis tools as early indicators of pre-release defect density , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[18]  Andreas Zeller,et al.  Predicting faults from cached history , 2008, ISEC '08.

[19]  Norman E. Fenton,et al.  Quantitative Analysis of Faults and Failures in a Complex Software System , 2000, IEEE Trans. Software Eng..

[20]  Harvey P. Siy,et al.  Predicting Fault Incidence Using Software Change History , 2000, IEEE Trans. Software Eng..

[21]  Ahmed E. Hassan,et al.  Predicting faults using the complexity of code changes , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[22]  Richard C. Holt,et al.  Studying the chaos of code development , 2003, 10th Working Conference on Reverse Engineering, 2003. WCRE 2003. Proceedings..

[23]  S. Kumar,et al.  Contributions to Hardware and Software Reliability , 1999, Series on Quality, Reliability and Engineering Statistics.

[24]  Richard C. Holt,et al.  The chaos of software development , 2003, Sixth International Workshop on Principles of Software Evolution, 2003. Proceedings..

[25]  Shigeru Yamada,et al.  S-Shaped Reliability Growth Modeling for Software Error Detection , 1983, IEEE Transactions on Reliability.

[26]  Tibor Gyimóthy,et al.  Empirical validation of object-oriented metrics on open source software for fault prediction , 2005, IEEE Transactions on Software Engineering.

[27]  P. K. Kapur,et al.  A software reliability growth model for an error-removal phenomenon , 1992, Softw. Eng. J..

[28]  Mitsuru Ohba,et al.  Inflection S-Shaped Software Reliability Growth Model , 1984 .