A hybrid genetic algorithm for feature subset selection in rough set theory

Rough set theory has proven to be an effective tool for feature subset selection. Current research usually employs hill-climbing as the search strategy for selecting a feature subset. However, such methods are inadequate for finding the optimal feature subset, since no heuristic can guarantee optimality. For this reason, many researchers have turned to stochastic methods. Because previous combinations of genetic algorithms (GA) and rough set theory have not shown competitive performance compared with some other stochastic methods, we propose in this paper a hybrid genetic algorithm for feature subset selection, called HGARSTAR. Unlike previous work, HGARSTAR embeds a novel local search operation based on rough set theory to fine-tune the search, which aims to enhance the GA’s intensification ability. Moreover, all candidates (i.e., feature subsets) generated during the evolutionary process are forced to include the core features, which accelerates convergence. To verify the proposed algorithm, experiments are performed on several standard UCI datasets. The experimental results demonstrate the efficiency of our algorithm.
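
The core-feature constraint mentioned above can be illustrated with a small sketch. It is not the HGARSTAR implementation: it assumes the standard rough set definitions (the dependency degree as the fraction of objects in the positive region, and the core as the set of features whose removal lowers that degree), and the names `dependency`, `core`, `enforce_core`, `ROWS`, and `DECISIONS` as well as the toy decision table are hypothetical.

```python
# Toy decision table (hypothetical): each row holds three condition features.
ROWS = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 1)]
DECISIONS = [0, 1, 1, 0, 0]


def dependency(rows, decisions, features):
    """Fraction of objects whose decision is determined by `features`
    (size of the positive region divided by the number of objects)."""
    feats = sorted(features)
    blocks = {}  # equivalence class key -> (decision labels seen, object count)
    for row, label in zip(rows, decisions):
        key = tuple(row[f] for f in feats)
        labels, count = blocks.get(key, (set(), 0))
        labels.add(label)
        blocks[key] = (labels, count + 1)
    # An equivalence class is in the positive region if it has one decision.
    positive = sum(count for labels, count in blocks.values() if len(labels) == 1)
    return positive / len(rows)


def core(rows, decisions):
    """Features whose removal from the full set lowers the dependency degree."""
    all_feats = set(range(len(rows[0])))
    full = dependency(rows, decisions, all_feats)
    return {
        f for f in all_feats
        if dependency(rows, decisions, all_feats - {f}) < full
    }


def enforce_core(candidate, core_feats):
    """Repair a candidate subset so that it always contains the core."""
    return set(candidate) | set(core_feats)


if __name__ == "__main__":
    core_feats = core(ROWS, DECISIONS)
    candidate = {2}  # e.g. a subset produced by crossover or mutation
    repaired = enforce_core(candidate, core_feats)
    print("core:", sorted(core_feats))
    print("dependency before repair:", dependency(ROWS, DECISIONS, candidate))
    print("dependency after repair:", dependency(ROWS, DECISIONS, repaired))
```

In a GA, a repair step of this kind would be applied to every offspring after crossover and mutation, so the search never spends evaluations on subsets that omit a core feature.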