Performance Evaluation of Data Mining Techniques for Predicting Software Reliability

Abstract—Accurate software reliability prediction not only enables developers to improve software quality but also provides information that helps them plan the use of valuable resources. This paper examines the performance of three well-known data mining techniques (CART, TreeNet and Random Forest) for predicting software reliability. We compare the performance of the proposed models with that of a Cascade Correlation Neural Network (CCNN) using sixteen empirical datasets from the Data and Analysis Center for Software. The goal of our study is to help project managers concentrate their testing efforts so as to minimize software failures and thereby improve the reliability of software systems. Two performance measures, Normalized Root Mean Squared Error (NRMSE) and Mean Absolute Error (MAE), show that the CART model is more accurate than the models built using Random Forest, TreeNet and CCNN on all datasets used in our study. Finally, we conclude that such methods can help in reliability prediction using real-life failure datasets.
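As an illustration of the two performance measures named above, the following minimal Python sketch computes MAE and NRMSE for a pair of actual and predicted series. The abstract does not state the exact NRMSE normalization, so the form used here (square root of the sum of squared errors divided by the sum of squared actual values) is an assumption; the function names and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def nrmse(actual, predicted):
    """Normalized Root Mean Squared Error.

    Assumed normalization (not confirmed by the abstract):
    sqrt(sum((actual - predicted)^2) / sum(actual^2)).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    squared_actual = sum(a ** 2 for a in actual)
    if squared_actual == 0:
        raise ValueError("actual values must not all be zero")
    return math.sqrt(squared_error / squared_actual)


# Hypothetical values, for illustration only.
actual_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.0, 2.0, 3.0, 6.0]

print("MAE:", mae(actual_values, predicted_values))
print("NRMSE:", nrmse(actual_values, predicted_values))
```

Under both measures a lower value indicates a closer fit, so a model comparison of the kind described in the abstract would rank models by these scores on each dataset.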
