Software Defect Prediction Scheme Based on Feature Selection

Predicting defect-prone software modules accurately and effectively are important ways to control the quality of a software system during software development. Feature selection can highly improve the accuracy and efficiency of the software defect prediction model. The main purpose of this paper is to discuss the best size of feature subset for building a prediction model and prove that feature selection method is useful for establishing software defect prediction model. Mutual information is an outstanding indicator of relevance between variables, and it has been used as a measurement in our feature selection algorithm. We also introduce a nonlinear factor to our evaluation function for feature selection to improve its performance. The results of our feature selection algorithm are validated by different machine learning methods. The experiment results show that all the classifiers achieve higher accuracy by using the feature subset provided by our algorithm.