Software Defect Prediction Based on Fourier Learning

Modern software systems have grown significantly in size and complexity, and with that growth comes a larger number of potential defects. Software defect prediction builds a predictive model from a defect data set composed of software defect metrics, and then uses that model to identify the program modules in a project that are likely to be defective. This paper uses the Fourier expansion of Boolean functions to build a software defect prediction model. We provide algorithms that calculate the Fourier coefficients and yield a prediction function for software defects. We compare the Fourier learning algorithm with traditional machine learning algorithms, such as random forest. The experimental results show that the Fourier learning algorithm not only outperforms the compared algorithms but is also more stable.
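To make the idea of learning a Boolean function from its Fourier coefficients concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes that metrics have already been binarized to values in {-1, +1}, that a label of +1 means "defective", and that only coefficients up to a small degree are estimated. The function names (`chi`, `estimate_coefficients`, `predict`) and the toy data are hypothetical. Each coefficient is estimated as the sample average of the label times the parity character of a feature subset, and the prediction is the sign of the resulting truncated expansion.

```python
from itertools import combinations, product
from math import prod


def chi(subset, x):
    """Parity character chi_S(x): product of x[i] for i in S, with x[i] in {-1, +1}."""
    return prod(x[i] for i in subset)  # empty product is 1, so chi of the empty set is 1


def estimate_coefficients(samples, labels, max_degree):
    """Estimate each Fourier coefficient as the sample mean of y * chi_S(x).

    Only subsets S with |S| <= max_degree are considered (an assumption of
    this sketch, made to keep the number of coefficients small).
    """
    n = len(samples[0])
    coeffs = {}
    for degree in range(max_degree + 1):
        for subset in combinations(range(n), degree):
            total = sum(y * chi(subset, x) for x, y in zip(samples, labels))
            coeffs[subset] = total / len(samples)
    return coeffs


def predict(coeffs, x):
    """Predict +1 (defective) or -1 from the sign of the truncated expansion."""
    score = sum(c * chi(subset, x) for subset, c in coeffs.items())
    return 1 if score >= 0 else -1


# Hypothetical toy data: three binarized metrics; a module is labeled
# defective (+1) only when the first two metrics are both +1.
samples = list(product([-1, 1], repeat=3))
labels = [1 if x[0] == 1 and x[1] == 1 else -1 for x in samples]

coeffs = estimate_coefficients(samples, labels, max_degree=2)
predictions = [predict(coeffs, x) for x in samples]
```

Because the toy data covers every input and the target depends on only two metrics, the degree-2 estimate recovers the function exactly. On real defect data the coefficients would be estimated from a sample and the predictions would be approximate.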
