Software Defect Prediction Using Supervised Learning Algorithm and Unsupervised Learning Algorithm

Software defect prediction has recently attracted attention of many software quality researchers. One of the major areas in current project management software is to effectively utilize resources to make meaningful impact on time and cost. A pragmatic assessment of metrics is essential in order to comprehend the quality of software and to ensure corrective measures. Software defect prediction methods are majorly used to study the impact areas in software using different techniques which comprises of neural network (NN) techniques, clustering techniques, statistical method and machine learning methods. These techniques of Data mining are applied in building software defect prediction models which improve the software quality. The aim of this paper is to propose various classification and clustering methods with an objective to predict software defect. To predict software defect we analyzed classification and clustering techniques. The performance of three data mining classifier algorithms named J48, Random Forest, and Naive Bayesian Classifier (NBC) are evaluated based on various criteria like ROC, Precision, MAE, RAE etc. Clustering technique is then applied on the data set using k-means, Hierarchical Clustering and Make Density Based Clustering algorithm. Evaluation of results for clustering is based on criteria like Time Taken, Cluster Instance, Number of Iterations, Incorrectly Clustered Instance and Log Likelihood etc. A thorough exploration of ten real time defect datasets of NASA[1] software project, followed by various applications on them finally results in defect prediction.