A Review of Feature Selection and Its Methods

Abstract: In the digital era, the data generated by various applications are growing drastically both row-wise and column-wise. This creates a bottleneck for analytics and increases the burden on machine learning algorithms used for pattern recognition. This curse of dimensionality can be handled through Dimensionality Reduction (DR) techniques, which take two forms: Feature Selection (FS) and Feature Extraction (FE). This paper surveys feature selection methods. From this extensive survey we conclude that most FS methods assume static data. However, since the emergence of IoT and web-based applications, data are generated dynamically and grow at a fast rate, so they are likely to be noisy, which hinders the performance of the algorithms. As the size of the data set increases, the scalability of FS methods is jeopardized, and the existing DR algorithms do not address the issues posed by dynamic data. Using FS methods not only reduces the burden of the data but also helps avoid overfitting of the model.
