A new ensemble feature selection approach based on genetic algorithm

In ensemble feature selection, adjusting the weight assigned to each feature subset can change the ensemble result substantially; finding the optimal weight vector is therefore a key and challenging problem. To address this optimization problem, this paper proposes an ensemble feature selection approach based on a genetic algorithm (EFS-BGA). After each base feature selector generates a feature subset, EFS-BGA uses a genetic algorithm to obtain an optimized weight for each subset, in contrast to traditional genetic-algorithm methods that operate directly on individual features. We present two variants of EFS-BGA: the first is a complete ensemble feature selection method, and building on it we further propose a selective EFS-BGA model. We then give a mathematical analysis that explains why weight adjustment is an optimization problem and how it can be solved. Finally, comparative experiments on multiple data sets demonstrate in practice the advantages of EFS-BGA over previous ensemble feature selection algorithms.
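The following is a minimal, self-contained sketch of the idea described above, not the authors' EFS-BGA implementation: a few base selectors each produce a feature subset, and a simple genetic algorithm evolves a weight vector over those subsets, scoring each candidate weight vector by the cross-validated accuracy of a classifier trained on the features chosen by a weighted vote. The helper names (aggregate_mask, fitness, evolve), the scikit-learn selectors and classifier, and all GA hyperparameters are illustrative assumptions.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

# Base feature selectors, each yielding a boolean feature-subset mask (hypothetical choices).
subsets = [
    SelectKBest(f_classif, k=10).fit(X, y).get_support(),
    SelectKBest(mutual_info_classif, k=10).fit(X, y).get_support(),
    SelectKBest(f_classif, k=20).fit(X, y).get_support(),
]
S = np.array(subsets, dtype=float)  # shape: (n_subsets, n_features)

def aggregate_mask(w, threshold=0.5):
    # Weighted vote over subsets: keep features whose normalized score passes the threshold.
    score = w @ S / max(w.sum(), 1e-9)
    return score >= threshold

def fitness(w):
    # Cross-validated accuracy of a classifier trained on the aggregated feature mask.
    mask = aggregate_mask(w)
    if not mask.any():
        return 0.0
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def evolve(pop_size=20, n_gen=15, mut_sigma=0.1):
    # Simple genetic algorithm over subset-weight vectors: truncation selection,
    # uniform crossover, Gaussian mutation; hyperparameters are illustrative.
    pop = rng.random((pop_size, S.shape[0]))
    for _ in range(n_gen):
        scores = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(S.shape[0]) < 0.5, a, b)
            child = np.clip(child + rng.normal(0.0, mut_sigma, S.shape[0]), 0.0, 1.0)
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(w) for w in pop])
    return pop[scores.argmax()], scores.max()

best_w, best_score = evolve()
print("best subset weights:", np.round(best_w, 2), "CV accuracy:", round(best_score, 3))

A selective variant, in the spirit of the paper's second model, could additionally discard subsets whose learned weight falls below a cutoff before aggregating; that step is omitted here for brevity.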
