An Empirical Study on the Equivalence and Stability of Feature Selection for Noisy Software Defect Data

Abstract—Software Defect Data (SDD) are used to build defect prediction models for software quality assurance. Existing work employs feature selection to eliminate irrelevant features from the data and thereby improve prediction performance. Previous studies have shown that different feature selection methods do not always yield similar prediction performance on SDD, which suggests that these methods are not equivalent. Previous studies have also shown that SDD usually contain noise that may interfere with the feature selection process. In this work, we empirically investigate and measure the equivalence of different feature selection methods for SDD, and we further analyze the stability of these methods on noisy SDD. We perform statistical analyses on eight projects from the NASA dataset with eight feature selection methods. For the equivalence analysis, we use Principal Component Analysis (PCA) and an overlap index to analyze the equivalence of these methods qualitatively and quantitatively, respectively. For the stability analysis, we apply a consistency index to measure the stability of these methods. Experimental results indicate that different feature selection methods are indeed not equivalent to each other, and that the Correlation and Fisher Score methods achieve better stability.

Keywords—defect data; feature selection; equivalence analysis; stability analysis
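The two set-similarity measures named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the "overlap index" is the Jaccard index over two selected feature subsets, and the "consistency index" is Kuncheva's consistency index for equal-sized subsets drawn from n total features.

```python
def jaccard_overlap(a, b):
    """Overlap (Jaccard) index between two selected feature subsets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def kuncheva_consistency(a, b, n):
    """Kuncheva's consistency index for two subsets of equal size k,
    selected from n features: (r*n - k^2) / (k*(n - k)), where r = |a ∩ b|.
    Corrects the raw overlap r for the overlap expected by chance."""
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k or not (0 < k < n):
        raise ValueError("subsets must have equal size k with 0 < k < n")
    r = len(a & b)
    return (r * n - k * k) / (k * (n - k))

# Hypothetical example: two feature-ranking methods each pick 4 of 10 metrics.
sel_method1 = {0, 1, 2, 3}
sel_method2 = {0, 1, 2, 4}
print(jaccard_overlap(sel_method1, sel_method2))        # 3 shared of 5 distinct
print(kuncheva_consistency(sel_method1, sel_method2, 10))
```

The consistency index ranges over (-1, 1] and equals 0 when the observed overlap matches random selection, which makes it better suited than raw overlap for comparing stability across different subset sizes.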
