FSCR: A Feature Selection Method for Software Defect Prediction

The invention relates to a hybrid feature selection method for software defect prediction. The method has three steps. First, the m most relevant features are selected from the original feature set and irrelevant features are discarded. Second, the m features are clustered according to the correlation between features, so that highly redundant features fall into the same cluster. Finally, following the idea of wrapper-based feature selection, one of the least relevant features in each cluster is deleted from the current feature subset to form a new feature subset, and that subset is evaluated using accuracy as the evaluation function. Because each new subset is formed by deleting one of the least relevant features from a different cluster, the number of feature subsets searched is effectively reduced.
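The three stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the relevance measure, the clustering algorithm, or the classifier, so this sketch assumes absolute Pearson correlation for both relevance and redundancy, a greedy threshold clustering, and leave-one-out accuracy of a nearest-centroid classifier as the evaluation function. All function names, the `threshold` parameter, and the toy data are hypothetical.

```python
from statistics import mean

def pearson(xs, ys):
    # Pearson correlation; returns 0.0 when either input is constant.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def column(X, j):
    return [row[j] for row in X]

def select_relevant(X, y, m):
    # Stage 1 (assumed measure): keep the m features with the highest |corr(feature, label)|.
    rel = {j: abs(pearson(column(X, j), y)) for j in range(len(X[0]))}
    return sorted(rel, key=rel.get, reverse=True)[:m], rel

def cluster_features(X, feats, threshold=0.8):
    # Stage 2 (assumed algorithm): a feature joins the first cluster whose
    # seed feature it is highly correlated with, else it starts a new cluster.
    clusters = []
    for j in feats:
        for c in clusters:
            if abs(pearson(column(X, j), column(X, c[0]))) >= threshold:
                c.append(j)
                break
        else:
            clusters.append([j])
    return clusters

def loo_accuracy(X, y, feats):
    # Assumed evaluator: leave-one-out accuracy of a nearest-centroid classifier.
    correct = 0
    for i in range(len(X)):
        cents = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            if rows:
                cents[label] = [mean(r[j] for r in rows) for j in feats]
        pred = min(cents, key=lambda l: sum((X[i][j] - c) ** 2
                                            for j, c in zip(feats, cents[l])))
        correct += pred == y[i]
    return correct / len(X)

def wrapper_search(X, y, clusters, rel):
    # Stage 3: each candidate subset drops the least relevant remaining feature
    # of one cluster, so at most len(clusters) subsets are evaluated per round.
    current = [j for c in clusters for j in c]
    best = loo_accuracy(X, y, current)
    while len(current) > 1:
        candidates = []
        for c in clusters:
            remaining = [j for j in c if j in current]
            if remaining:
                drop = min(remaining, key=rel.get)
                subset = [j for j in current if j != drop]
                candidates.append((loo_accuracy(X, y, subset), subset))
        acc, subset = max(candidates, key=lambda t: t[0])
        if acc < best:  # stop once every deletion lowers accuracy
            break
        best, current = acc, subset
    return sorted(current), best

# Toy data (hypothetical): features 0 and 1 are redundant, 2 and 3 are noise.
X = [[1.0, 1.1, 5.0, 0.3], [1.2, 1.3, 3.0, 0.9], [0.9, 1.0, 4.0, 0.1],
     [1.1, 1.2, 6.0, 0.7], [3.0, 3.1, 5.5, 0.2], [3.2, 3.3, 3.5, 0.8],
     [2.9, 3.0, 4.5, 0.4], [3.1, 3.2, 5.0, 0.6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

kept, rel = select_relevant(X, y, m=3)
clusters = cluster_features(X, kept)
selected, acc = wrapper_search(X, y, clusters, rel)
print(kept, clusters, selected, acc)
```

In this sketch each round evaluates at most one candidate subset per cluster, rather than one per remaining feature as plain backward elimination would, which is the source of the reduced search that the abstract describes.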
