Empirical Assessment of Machine Learning Based Software Defect Prediction Techniques

Automated reliability assessment is essential for systems that entail dynamic adaptation based on runtime mission-specific requirements. One approach along this direction is to monitor and assess the system using machine learning-based software defect prediction techniques. Due to the dynamic nature of software data collected, Instance-based learning algorithms are proposed for the above purposes. To evaluate the accuracy of these methods, the paper presents an empirical analysis of four different real-time software defect data sets using different predictor models. The results show that a combination of 1R and Instance-based learning along with Consistency-based subset evaluation technique provides a relatively better consistency in achieving accurate predictions as compared with other models. No direct relationship is observed between the skewness present in the data sets and the prediction accuracy of these models. Principal Component Analysis (PCA) does not show a consistent advantage in improving the...