Feature Selection With Controlled Redundancy in a Fuzzy Rule Based Framework

Features with good predictive power for classes or output variables are useful, and most feature selection methods therefore try to find them. However, such features may be highly correlated or nonlinearly dependent on one another, so comparable performance can often be obtained with only a few of them. A feature selection method should therefore select useful features with controlled redundancy. In this paper, we propose a novel learning method that imposes a penalty on the use of dependent or correlated features, performing system identification and feature selection together. The scheme can choose good features, discard indifferent and derogatory features, and control the level of redundancy in the selected set. This is probably the first attempt at feature selection with redundancy control in a fuzzy rule based framework. We demonstrate the effectiveness of the method using a tenfold cross-validation setup on a synthetic dataset and on several commonly used classification datasets, and we compare our results with some state-of-the-art methods.
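To make the idea of a redundancy penalty concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes each feature i has a hypothetical gate value g_i in [0, 1] (1 means the feature is fully used) and measures dependence by absolute Pearson correlation. The function name `redundancy_penalty` and the pairwise-average form are illustrative assumptions.

```python
import numpy as np

def redundancy_penalty(X, gates):
    """Mean of g_i * g_j * |corr(x_i, x_j)| over all feature pairs i < j.

    Illustrative assumption only: the gates and this penalty form are
    hypothetical, not taken from the paper.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))  # p x p absolute correlations
    iu = np.triu_indices(X.shape[1], k=1)        # index pairs with i < j
    weights = np.outer(gates, gates)             # g_i * g_j for every pair
    return float(np.mean(weights[iu] * corr[iu]))

# Synthetic data: x1 is a near-duplicate of x0, x2 is independent of both.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
x1 = x0 + 0.05 * rng.normal(size=200)
x2 = rng.normal(size=200)
X = np.column_stack([x0, x1, x2])

# Selecting the two near-duplicates is penalized more than selecting
# one of them together with the independent feature.
redundant_pair = redundancy_penalty(X, np.array([1.0, 1.0, 0.0]))
diverse_pair = redundancy_penalty(X, np.array([1.0, 0.0, 1.0]))
```

In a learning scheme of this kind, such a term would typically be added to the training error with a weighting coefficient, so that a larger coefficient pushes the learner toward less redundant feature subsets. Pearson correlation captures only linear dependence; a measure such as mutual information would be needed to penalize nonlinear dependence.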
