Improved random forest algorithm based system and method for software fault prediction

The invention discloses a system and method for software fault prediction based on an improved random forest algorithm. The system comprises a data processing layer, a prediction model building layer, and a fault prediction layer. The method includes: computing the software project attribute set needed for model training to obtain a training data set for the software prediction model, and equalizing (balancing) that training data set; building a prediction model with the improved random forest algorithm; screening the model against performance limits on accuracy and recall; and predicting faults in a software project from the attribute set information of the to-be-predicted project and the trained prediction model, then displaying the prediction results and the prediction model. The system and method offer high prediction accuracy, stable performance, and high execution efficiency. They can evaluate whether a final software product reaches the specified quality or meets user expectations, and can guide developers in formulating allocation strategies for software testing and formal verification resources.
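
The equalization and screening steps can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented method: random oversampling stands in for the unspecified equalization technique, the two candidate "models" are hypothetical threshold rules rather than random forests, and the attribute names, toy data, and the 0.7 limits are invented for the example.

```python
import random
from collections import Counter


def oversample(rows, labels, seed=0):
    """Balance classes by duplicating randomly chosen minority-class rows.

    Assumption: random oversampling is one possible form of "equalization".
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def accuracy_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall of the fault-prone class)."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = correct / len(y_true) if y_true else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, recall


def screen_models(models, rows, labels, min_accuracy=0.7, min_recall=0.7):
    """Keep only the models that meet both performance limits."""
    kept = []
    for name, predict in models.items():
        preds = [predict(r) for r in rows]
        acc, rec = accuracy_recall(labels, preds)
        if acc >= min_accuracy and rec >= min_recall:
            kept.append(name)
    return kept


# Toy attribute set per module: (lines_of_code, cyclomatic_complexity); 1 = fault-prone.
rows = [(120, 3), (800, 25), (90, 2), (60, 1), (950, 30), (150, 4)]
labels = [0, 1, 0, 0, 1, 0]

balanced_rows, balanced_labels = oversample(rows, labels)

# Hypothetical stand-ins for trained prediction models.
models = {
    "complexity_rule": lambda r: 1 if r[1] > 10 else 0,
    "always_clean": lambda r: 0,
}
kept = screen_models(models, balanced_rows, balanced_labels)
print(Counter(balanced_labels), kept)
```

On the balanced data, a model that always predicts "no fault" reaches only 50% accuracy and zero recall, so it is screened out. On the original imbalanced data the same model would score about 67% accuracy, which shows why balancing before screening matters.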
