An evolutionary algorithm with the partial sequential forward floating search mutation for large-scale feature selection problems

Several meta-heuristic algorithms, such as evolutionary algorithms (EAs) and genetic algorithms (GAs), have been developed for feature selection problems because they search feature subset spaces efficiently. Recently, hybrid GAs have been proposed to improve the performance of conventional GAs by embedding a local search operation, or sequential forward floating search mutation, into the GA. Because existing hybrid algorithms apply the mutation operation and the local improvement operation sequentially, the local improvement procedure may damage the genetic information that individuals obtained from the genetic operations. In addition, the local search operation used in existing hybrid algorithms is not suited to large-scale problems. We therefore propose a novel approach for large-sized feature selection problems, namely an EA with a partial sequential forward floating search mutation (EAwPS). The proposed approach integrates a local search technique, the partial sequential forward floating search mutation, into an EA. Two algorithms have been developed: EAwPS-binary representation (EAwPS-BR) for medium-sized problems and EAwPS-integer representation (EAwPS-IR) for large-sized problems. Incorporating a local improvement method into the EA speeds up the search and directs it toward promising regions of the solution space. We compare the performance of the proposed algorithms with that of other popular meta-heuristic algorithms on medium- and large-sized data sets. Experimental results demonstrate that the proposed EAwPS selects better features within reasonable computational times.
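The abstract does not spell out the mutation operator, so the following Python sketch is only one plausible reading of a "partial" sequential forward floating search step, not the authors' implementation. It assumes that "partial" means evaluating a random sample of candidate features instead of all of them; the names `partial_sffs_mutation`, `score`, and `n_candidates` are hypothetical. The step adds the best sampled feature if it improves the score, then conditionally removes other features while removal keeps improving the score.

```python
import random
from typing import Callable, Optional, Set


def partial_sffs_mutation(
    subset: Set[int],
    n_features: int,
    score: Callable[[Set[int]], float],
    n_candidates: int,
    rng: random.Random,
) -> Set[int]:
    """Hypothetical partial SFFS step: one forward inclusion followed by
    floating backward exclusions, each over a random sample of features."""
    current = set(subset)
    best = score(current)

    # Forward step: evaluate only a random sample of the unselected features
    # (assumed meaning of "partial").
    unselected = [f for f in range(n_features) if f not in current]
    sample = rng.sample(unselected, min(n_candidates, len(unselected)))
    added: Optional[int] = None
    for f in sample:
        s = score(current | {f})
        if s > best:
            added, best = f, s
    if added is None:
        return current  # no sampled feature improves the subset
    current.add(added)

    # Floating backward steps: drop a previously selected feature
    # for as long as doing so strictly improves the score.
    while len(current) > 1:
        selected = [f for f in sorted(current) if f != added]
        sample = rng.sample(selected, min(n_candidates, len(selected)))
        dropped: Optional[int] = None
        for f in sample:
            s = score(current - {f})
            if s > best:
                dropped, best = f, s
        if dropped is None:
            break
        current.remove(dropped)
    return current
```

Because a feature is added or removed only when the score strictly improves, the returned subset never scores worse than the input, so under this reading the operator cannot undo gains made by earlier genetic operations. The cost of one call is bounded by `n_candidates` evaluations per forward or backward pass, which is the property that would matter for large feature counts.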
