What Can Fault Prediction Do for YOU?

Knowing in advance which files in the next release of a large software system are likely to contain the most faults would be very valuable, whether the system is to be validated by testing, by formal verification, or by a hybrid approach. To that end, we developed negative binomial regression models and used them to predict the expected number of faults in each file of the next release of a system. The predictions are based on code characteristics and on fault and modification history data. This paper discusses what we have learned from applying the model to several large industrial systems, each with multiple years of field exposure. It also discusses our success in making accurate predictions and some of the issues that had to be considered.
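As a rough illustration of how such a model might be used once fitted, the sketch below scores files with a log-linear mean, which is the form of the expected count in a negative binomial regression, and ranks them by predicted faults. The predictor names (`kloc`, `prior_faults`, `prior_changes`, `is_new`), the coefficient values, and the sample files are all hypothetical and chosen only for illustration; they are not the variables or fitted values from the paper. Fitting the coefficients and the dispersion parameter is omitted here.

```python
import math

# Hypothetical coefficients, for illustration only (not fitted values from the paper).
COEFFS = {
    "intercept": -2.0,
    "log_kloc": 0.5,        # weight on log of file size in KLOC
    "prior_faults": 0.3,    # weight on faults found in the previous release
    "prior_changes": 0.1,   # weight on changes made in the previous release
    "is_new": 1.0,          # weight on whether the file is new in this release
}


def expected_faults(f):
    """Expected fault count under a log-linear mean: mu = exp(b0 + sum(b_j * x_j))."""
    eta = (
        COEFFS["intercept"]
        + COEFFS["log_kloc"] * math.log(f["kloc"])
        + COEFFS["prior_faults"] * f["prior_faults"]
        + COEFFS["prior_changes"] * f["prior_changes"]
        + COEFFS["is_new"] * (1 if f["is_new"] else 0)
    )
    return math.exp(eta)


def rank_files(files):
    """Return files sorted from highest to lowest predicted fault count."""
    return sorted(files, key=expected_faults, reverse=True)


# Hypothetical files with made-up characteristics.
FILES = [
    {"name": "a.c", "kloc": 1.0, "prior_faults": 0, "prior_changes": 0, "is_new": False},
    {"name": "b.c", "kloc": 4.0, "prior_faults": 3, "prior_changes": 5, "is_new": False},
    {"name": "c.c", "kloc": 2.0, "prior_faults": 0, "prior_changes": 0, "is_new": True},
]

if __name__ == "__main__":
    for f in rank_files(FILES):
        print(f"{f['name']}: {expected_faults(f):.3f} expected faults")
```

A ranking like this could be used to direct testing or verification effort to the files at the top of the list first. The dispersion parameter of the negative binomial distribution affects how the coefficients are estimated and how variable the counts are, but not the formula for the predicted mean shown above.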
