Software Defect Prediction Using Random Forest Algorithm

Software defects adversely affect both the cost and the quality of software, so predicting them early supports the development of good-quality software. In this paper, defects are predicted on the public PROMISE dataset by applying the random forest (RF) algorithm with the RapidMiner machine learning tool. The paper compares the performance of RF for different numbers of trees. The results show that accuracy increases slightly as the number of trees grows; the maximum accuracy is 99.59% and the minimum is 85.96%. A second comparison is based on the area under the ROC curve (AUC), which is regarded as the most informative indicator of predictive accuracy in the field of software defect prediction. All of the results indicate that the RF algorithm is effective for this prediction task and that a forest of one hundred trees is the most suitable configuration.
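The two evaluation measures named above, accuracy and AUC, can be illustrated with a minimal sketch. It is not the RapidMiner workflow used in the paper. The function names, the labels, and the scores are hypothetical and are not taken from PROMISE; each score is assumed to stand for the fraction of trees in a forest that vote "defective" for a module.

```python
def accuracy(y_true, y_pred):
    """Fraction of modules whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    This equals the probability that a randomly chosen defective module
    (label 1) gets a higher score than a randomly chosen clean module
    (label 0); tied scores count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one module of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from PROMISE): 1 = defective, 0 = clean.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical forest scores: assumed fraction of trees voting "defective".
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]
# Majority vote: predict "defective" when more than half the trees say so.
y_pred = [1 if s > 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {auc(y_true, scores):.3f}")
```

Accuracy depends on the voting threshold (0.5 here), whereas AUC depends only on how the scores rank defective modules against clean ones. This is one reason the two measures can lead to different conclusions when forests of different sizes are compared.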