Software Fault Proneness Prediction with Group Lasso Regression: On Factors that Affect Classification Performance

Machine learning algorithms have been used extensively for software fault proneness prediction. This paper presents the first application of Group Lasso Regression (G-Lasso) to software fault proneness classification and compares its performance to that of six widely used machine learning algorithms. Furthermore, we explore the effects of two factors on prediction performance: imbalance treatment using the Synthetic Minority Over-sampling Technique (SMOTE), and the choice of datasets used to build the prediction models. Our experimental results are based on 22 datasets extracted from open source projects. The main findings include: (1) G-Lasso is robust to imbalanced data and significantly outperforms the other machine learning algorithms with respect to Recall and G-Score, i.e., the harmonic mean of Recall and (1 - False Positive Rate). (2) Even though SMOTE improved the performance of all learners, it did not have a statistically significant effect on G-Lasso's Recall and G-Score. Random Forest was in the top performing group of learners for all performance metrics, while Naive Bayes performed the worst of all learners. (3) When using the same change metrics as features, the choice of dataset had no effect on the performance of most learners, including G-Lasso. Naive Bayes was the most affected, especially when balanced datasets were used.
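
As an illustration of the G-Score definition given above (the harmonic mean of Recall and 1 - False Positive Rate), the following minimal Python sketch computes it from confusion-matrix counts. The function name `g_score`, its argument names, and the convention of returning 0.0 when a denominator is zero are assumptions made for this example; they are not taken from the paper.

```python
def g_score(tp: int, fp: int, tn: int, fn: int) -> float:
    """Harmonic mean of Recall and (1 - False Positive Rate).

    tp, fp, tn, fn are confusion-matrix counts, with "fault-prone"
    treated as the positive class. Returning 0.0 for an undefined
    ratio is an assumption of this sketch, not a rule from the paper.
    """
    # Recall: share of fault-prone modules that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # False Positive Rate: share of fault-free modules wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    one_minus_fpr = 1.0 - fpr

    denom = recall + one_minus_fpr
    if denom == 0.0:
        return 0.0
    return 2.0 * recall * one_minus_fpr / denom


if __name__ == "__main__":
    # Hypothetical counts: Recall = 0.8, FPR = 0.1.
    print(round(g_score(tp=80, fp=10, tn=90, fn=20), 4))
```

Because it is a harmonic mean, the score is high only when both Recall and (1 - False Positive Rate) are high: a learner that flags nothing as fault-prone has Recall 0 and therefore a G-Score of 0, whatever its false positive rate.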
