A novel hybrid feature selection method based on dynamic feature importance

Abstract Feature selection aims to eliminate unimportant and redundant features and to select effective, interacting ones. Accurately measuring the relationships among candidate features, already-selected features, and class labels during the selection process is challenging, especially for high-dimensional, small-sample-size data. To this end, a new measure named Dynamic Feature Importance (DFI) is proposed, together with a corresponding feature selection algorithm named Dynamic Feature Importance based Feature Selection (DFIFS). To obtain higher classification accuracy with fewer features, a Modified-DFIFS (M-DFIFS) algorithm is further developed by combining DFIFS with classical filters. In experiments on 14 public high-dimensional datasets, the proposed M-DFIFS algorithm achieves significantly better average accuracy than five typical filter algorithms, with acceptable computing time. When random forest is used as the classifier, M-DFIFS also substantially reduces the number of selected features. These results verify that the new "Filter + DFIFS" feature selection framework is highly effective at obtaining high accuracy with few features.
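The abstract does not specify the DFI measure or the DFIFS procedure, so the Python sketch below only illustrates the general shape of the "Filter + DFIFS" two-stage framework: a cheap filter first prunes the candidate pool, then an importance-driven stage picks the final subset. Using mutual information as the filter and random-forest importances as a stand-in for DFI is an assumption for illustration, as are the function name filter_plus_dfifs and the parameters n_filter and n_final; this is not the paper's actual method.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif

def filter_plus_dfifs(X, y, n_filter=100, n_final=20, random_state=0):
    """Two-stage selection sketch: a filter prunes the feature pool,
    then an importance-based stage selects the final subset."""
    # Stage 1 (filter): rank all features by mutual information with the
    # class labels and keep the top n_filter candidates.
    mi = mutual_info_classif(X, y, random_state=random_state)
    candidates = np.argsort(mi)[::-1][:n_filter]

    # Stage 2 (stand-in for DFIFS): re-rank the surviving candidates by
    # random-forest importance computed on the reduced pool and keep the
    # top n_final. The real DFIFS updates feature importances dynamically
    # as features are selected; this static re-ranking is only a sketch.
    rf = RandomForestClassifier(n_estimators=200, random_state=random_state)
    rf.fit(X[:, candidates], y)
    order = np.argsort(rf.feature_importances_)[::-1][:n_final]
    return candidates[order]  # indices into the original feature space

In this sketch the filter stage is what keeps the cost acceptable on high-dimensional data: the expensive importance estimation runs only on the reduced candidate pool rather than on all features.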
