Genetic programming model for software quality classification

We apply genetic programming (GP) to build a software quality classification model from the metrics of software modules. The model attempts to distinguish fault-prone modules from non-fault-prone modules. To avoid overfitting, the GP experiments use random subset selection, and the whole fit data set then serves as the validation data set for selecting the best model. Two case studies demonstrate that the GP technique can achieve good results. We also compare GP modeling with logistic regression modeling to verify the usefulness of GP.
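
The selection scheme described above can be illustrated with a minimal sketch. It is not the implementation used in the experiments. It assumes that each run draws its own random subset of the fit data, and it replaces the GP search with a simple random search over threshold rules as a stand-in. All function names, parameters, and the data layout (a list of `(metrics, fault_prone)` pairs) are hypothetical.

```python
import random


def misclassification_rate(model, data):
    """Fraction of modules whose predicted label differs from the true label."""
    errors = sum(1 for metrics, fault_prone in data if model(metrics) != fault_prone)
    return errors / len(data)


def make_threshold_model(feature, threshold):
    """Stand-in for an evolved GP program: flag a module as fault-prone
    when a single metric exceeds a threshold."""
    return lambda metrics: metrics[feature] > threshold


def train_on_subset(subset, n_features, rng, n_candidates=50):
    """Placeholder for one GP run (assumption: random search, not real GP).
    Returns the candidate rule with the lowest error on the subset."""
    best, best_err = None, float("inf")
    for _ in range(n_candidates):
        feature = rng.randrange(n_features)
        # Use an observed metric value from the subset as the threshold.
        threshold = rng.choice(subset)[0][feature]
        model = make_threshold_model(feature, threshold)
        err = misclassification_rate(model, subset)
        if err < best_err:
            best, best_err = model, err
    return best


def select_best_model(fit_data, n_runs=10, subset_fraction=0.5, seed=0):
    """Train one model per random subset, then pick the model with the
    lowest misclassification rate on the whole fit data set."""
    rng = random.Random(seed)
    n_features = len(fit_data[0][0])
    subset_size = max(1, int(len(fit_data) * subset_fraction))
    best, best_err = None, float("inf")
    for _ in range(n_runs):
        subset = rng.sample(fit_data, subset_size)
        model = train_on_subset(subset, n_features, rng)
        # Validation step: score on the whole fit data set, not the subset.
        err = misclassification_rate(model, fit_data)
        if err < best_err:
            best, best_err = model, err
    return best, best_err
```

Because each candidate sees only part of the fit data during training, scoring it on the whole fit data set penalizes rules that merely fit the quirks of one subset. In a real study the placeholder `train_on_subset` would be replaced by an actual GP run.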