Feature subset selection based on fuzzy entropy measures for handling classification problems

Abstract: In this paper, we present a new method for feature subset selection based on fuzzy entropy measures for handling classification problems. First, we discretize numeric features to construct the membership function of each fuzzy set of a feature. Then, we select the feature subset based on the proposed fuzzy entropy measure, which focuses on boundary samples. The features selected by the proposed method yield higher average classification accuracy rates than those selected by the MIFS method (Battiti, R. in IEEE Trans. Neural Netw. 5(4):537–550, 1994), the FQI method (De, R.K., et al. in Neural Netw. 12(10):1429–1455, 1999), the OFEI method, Dong-and-Kothari’s method (Dong, M., Kothari, R. in Pattern Recognit. Lett. 24(9):1215–1225, 2003) and the OFFSS method (Tsang, E.C.C., et al. in IEEE Trans. Fuzzy Syst. 11(2):202–213, 2003).
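
The two steps named in the abstract (building membership functions from discretized features, then scoring features by fuzzy entropy) can be illustrated with a minimal sketch. This is not the boundary-sample measure proposed in the paper, whose definition the abstract does not give. It assumes triangular membership functions peaked at given, strictly increasing centers, and a generic class-wise fuzzy entropy in which lower values indicate a more discriminative feature. All function names here are hypothetical.

```python
import math


def triangular_memberships(x, centers):
    """Memberships of x in triangular fuzzy sets peaked at `centers`.

    Assumption: `centers` is strictly increasing; values outside the
    range belong fully to the nearest outer fuzzy set.
    """
    n = len(centers)
    if n == 1:
        return [1.0]
    mu = [0.0] * n
    if x <= centers[0]:
        mu[0] = 1.0
    elif x >= centers[-1]:
        mu[-1] = 1.0
    else:
        for i in range(n - 1):
            lo, hi = centers[i], centers[i + 1]
            if lo <= x <= hi:
                mu[i + 1] = (x - lo) / (hi - lo)
                mu[i] = 1.0 - mu[i + 1]
                break
    return mu


def feature_fuzzy_entropy(values, labels, centers):
    """Generic fuzzy entropy of one feature (illustrative, not the paper's).

    For each fuzzy set, class proportions are computed from summed
    memberships; the per-set entropies are averaged, weighted by the
    share of total membership each fuzzy set holds.
    """
    classes = sorted(set(labels))
    # mass[j][c] = total membership of class-c samples in fuzzy set j
    mass = [{c: 0.0 for c in classes} for _ in centers]
    for x, y in zip(values, labels):
        for j, m in enumerate(triangular_memberships(x, centers)):
            mass[j][y] += m
    total = sum(sum(m.values()) for m in mass)
    entropy = 0.0
    for m in mass:
        set_total = sum(m.values())
        if set_total == 0.0:
            continue  # empty fuzzy set contributes nothing
        h = 0.0
        for c in classes:
            d = m[c] / set_total
            if d > 0.0:
                h -= d * math.log2(d)
        entropy += (set_total / total) * h
    return entropy


def rank_features(columns, labels, centers_by_feature):
    """Return feature names sorted from lowest to highest fuzzy entropy."""
    scores = {
        name: feature_fuzzy_entropy(vals, labels, centers_by_feature[name])
        for name, vals in columns.items()
    }
    return sorted(scores, key=scores.get)


# Toy data: feature "a" separates the two classes, feature "b" does not.
labels = [0, 0, 1, 1]
columns = {"a": [0.0, 0.1, 0.9, 1.0], "b": [0.0, 1.0, 0.0, 1.0]}
centers = {"a": [0.0, 1.0], "b": [0.0, 1.0]}
ranking = rank_features(columns, labels, centers)
```

On this toy data, the feature that separates the classes receives the lower entropy and is ranked first. A subset selection procedure would build on such a score, for example by adding features greedily; the specific selection rule and the treatment of boundary samples are defined in the paper itself.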

[1]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[2]  Chandrika Kamath,et al.  Dimension reduction techniques and the classification of bent double galaxies , 2002, Comput. Stat. Data Anal..

[3]  Bart Kosko,et al.  Fuzzy entropy and conditioning , 1986, Inf. Sci..

[4]  Lotfi A. Zadeh,et al.  The Concepts of a Linguistic Variable and its Application to Approximate Reasoning , 1975 .

[5]  Shyi-Ming Chen,et al.  A New Approach for Handling Classification Problems Based on Fuzzy Information Gain Measures , 2006, 2006 IEEE International Conference on Fuzzy Systems.

[6]  L. Zadeh Probability measures of Fuzzy events , 1968 .

[7]  J. A. Hartigan,et al.  A k-means clustering algorithm , 1979 .

[8]  Rich Caruana,et al.  Greedy Attribute Selection , 1994, ICML.

[9]  James Nga-Kwok Liu,et al.  An elastic contour matching model for tropical cyclone pattern recognition , 2001, IEEE Trans. Syst. Man Cybern. Part B.

[10]  Sankar K. Pal,et al.  Unsupervised Feature Selection , 2004 .

[11]  Sang Joon Kim,et al.  A Mathematical Theory of Communication , 2006 .

[12]  John C. Platt Using Analytic QP and Sparseness to Speed Training of Support Vector Machines , 1998, NIPS.

[13]  Shyi-Ming Chen A new approach to handling fuzzy decision-making problems , 1988 .

[14]  Paul W. Baim A Method for Attribute Selection in Inductive Learning Systems , 1988, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Shyi-Ming Chen,et al.  GENERATING FUZZY RULES FROM TRAINING DATA CONTAINING NOISE FOR HANDLING CLASSIFICATION PROBLEMS , 2002, Cybern. Syst..

[16]  Shyi-Ming Chen,et al.  A new approach to handling fuzzy decision-making problems , 1988, [1988] Proceedings. The Eighteenth International Symposium on Multiple-Valued Logic.

[17]  Roberto Battiti,et al.  Using mutual information for selecting features in supervised neural net learning , 1994, IEEE Trans. Neural Networks.

[18]  Xizhao Wang,et al.  OFFSS: optimal fuzzy-valued feature subset selection , 2003, IEEE Trans. Fuzzy Syst..

[19]  Sankar K. Pal,et al.  Unsupervised feature evaluation: a neuro-fuzzy approach , 2000, IEEE Trans. Neural Networks Learn. Syst..

[20]  Sankar K. Pal,et al.  Feature analysis: Neural network and fuzzy set theoretic approaches , 1997, Pattern Recognit..

[21]  Shyi-Ming Chen,et al.  A New Method for Feature Subset Selection for Handling Classification Problems , 2005, The 14th IEEE International Conference on Fuzzy Systems, 2005. FUZZ '05..

[22]  Ravi Kothari,et al.  Feature subset selection using a new definition of classifiability , 2003, Pattern Recognit. Lett..

[23]  Ron Kohavi,et al.  Irrelevant Features and the Subset Selection Problem , 1994, ICML.

[24]  Settimo Termini,et al.  A Definition of a Nonprobabilistic Entropy in the Setting of Fuzzy Sets Theory , 1972, Inf. Control..

[25]  Shyi-Ming Chen,et al.  AUTOMATICALLY CONSTRUCTING MEMBERSHIP FUNCTIONS AND GENERATING FUZZY RULES USING GENETIC ALGORITHMS , 2002 .

[26]  Eibe Frank,et al.  Logistic Model Trees , 2003, ECML.

[27]  Sankar K. Pal,et al.  Neuro-fuzzy feature evaluation with theoretical analysis , 1999, Neural Networks.

[28]  N. Chaikla,et al.  Genetic algorithms in feature selection , 1999, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics.

[29]  Cullen Schaffer Overfitting avoidance as bias , 2004, Machine Learning.

[30]  Chih-Ming Chen,et al.  An efficient fuzzy classifier with feature selection based on fuzzy entropy , 2001, IEEE Trans. Syst. Man Cybern. Part B.

[31]  Shyi-Ming Chen,et al.  A NEW METHOD TO CONSTRUCT MEMBERSHIP FUNCTIONS AND GENERATE WEIGHTED FUZZY RULES FROM TRAINING INSTANCES , 2005, Cybern. Syst..