Proportional Hybrid Mechanism for Population Based Feature Selection Algorithm

Feature selection is an important research field for pattern classification, data mining, and related areas. Population-based optimization algorithms (POA) are highly parallel and are widely used as the search algorithm for feature selection. Population-based feature selection algorithms (PFSA) involve a compromise between precision and time cost, so optimizing a PFSA requires improving the underlying feature selection model. Feature selection algorithms broadly fall into two categories: the filter model and the wrapper model. The filter model is fast but less precise, while the wrapper model is more precise but generally more computationally intensive. In this paper, we propose a new mechanism, the proportional hybrid mechanism (PHM), that combines the advantages of the filter and wrapper models. The mechanism can be applied in PFSA to improve their performance. Genetic algorithms (GAs) have been applied to many kinds of feature selection problems as the search algorithm because of their high efficiency and implicit parallelism, and they are therefore used in this paper. To validate the mechanism, seven datasets from the University of California, Irvine (UCI) database and artificial toy datasets are tested. The experiments are carried out for different GAs, classifiers, and evaluation criteria. The results show that, with the introduction of PHM, the GA-based feature selection algorithm improves in both time cost and classification accuracy. Moreover, a comparison of GA-based, PSO-based, and some other feature selection algorithms demonstrates that the PHM can be used in other population-based feature selection algorithms and obtain satisfactory results.
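The filter/wrapper trade-off described above can be made concrete with a small sketch. The abstract does not specify how PHM mixes the two models, so the code below is only an illustration under stated assumptions: the filter criterion (mean absolute Pearson correlation with the label), the wrapper criterion (leave-one-out 1-nearest-neighbour accuracy), the toy data, and the `wrapper_fraction` parameter are all hypothetical choices, not the authors' definitions.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def filter_score(X, y, mask):
    """Assumed filter criterion: mean |correlation| of selected features with the label."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    return mean(abs(pearson([row[j] for row in X], y)) for j in cols)


def wrapper_score(X, y, mask):
    """Assumed wrapper criterion: leave-one-out accuracy of a 1-NN classifier."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    hits = 0
    for i, row in enumerate(X):
        # Nearest other sample, measured only on the selected features.
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((row[j] - X[k][j]) ** 2 for j in cols),
        )
        hits += y[nearest] == y[i]
    return hits / len(X)


def evaluate_population(X, y, population, wrapper_fraction, rng):
    """Hypothetical mixing rule: a random share of masks gets the costly
    wrapper evaluation, the remainder gets the cheap filter evaluation."""
    n_wrapper = round(wrapper_fraction * len(population))
    wrapper_ids = set(rng.sample(range(len(population)), n_wrapper))
    return [
        (wrapper_score if i in wrapper_ids else filter_score)(X, y, mask)
        for i, mask in enumerate(population)
    ]


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [1.0, 1.5], [1.1, 4.5], [1.2, 0.5]]
y = [0, 0, 0, 1, 1, 1]
population = [[1, 0], [0, 1], [1, 1], [0, 0]]  # binary feature masks

scores = evaluate_population(X, y, population, wrapper_fraction=0.5,
                             rng=random.Random(0))
```

Note that the two criteria live on different scales, so a real hybrid scheme would need some way to make filter and wrapper scores comparable before selection; this sketch does not attempt that.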
