An evolutionary attribute clustering and selection method based on feature similarity

We previously proposed a GA-based clustering method for attribute clustering and feature selection. In that method, the fitness of each individual was evaluated using both the cluster balance and the average accuracy of attribute substitutions within clusters, which made the evaluation quite time-consuming. In this paper, we modify the previous method to reduce execution time by basing the evaluation on feature similarity and feature dependence. The fitness of a chromosome now combines the total degree of similarity between pairs of features with the accuracy of the cluster centers, rather than the average accuracy over all attribute combinations. Experimental results show the performance of the proposed approach.
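As a rough illustration of the kind of fitness described above, the following Python sketch scores a candidate clustering of attributes. It is not the paper's formulation: the abstract does not say how the two terms are combined, so the product used here is an assumption, as is restricting the similarity sum to pairs of features inside the same cluster. The names `total_similarity`, `fitness`, `sim`, and `center_accuracy` are hypothetical, and the accuracy of the centers is supplied by the caller as a stand-in for training a classifier on the center attributes only.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def total_similarity(clusters: Sequence[Sequence[int]],
                     sim: Sequence[Sequence[float]]) -> float:
    """Sum the similarity of every feature pair that shares a cluster.

    Assumption: only within-cluster pairs count toward the total.
    """
    return sum(sim[a][b]
               for members in clusters
               for a, b in combinations(members, 2))


def fitness(clusters: Sequence[Sequence[int]],
            centers: Sequence[int],
            sim: Sequence[Sequence[float]],
            center_accuracy: Callable[[Sequence[int]], float]) -> float:
    """Combine total similarity with the accuracy of the centers.

    Assumption: the two terms are multiplied; the source does not
    specify the combination.
    """
    return total_similarity(clusters, sim) * center_accuracy(centers)


# Toy symmetric similarity matrix for four features (made-up values).
sim: List[List[float]] = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]


def stub_accuracy(centers: Sequence[int]) -> float:
    """Placeholder: a real version would evaluate a classifier
    built from the center attributes alone."""
    return 0.5


good = fitness([[0, 1], [2, 3]], [0, 2], sim, stub_accuracy)
bad = fitness([[0, 2], [1, 3]], [0, 1], sim, stub_accuracy)
```

In this toy setting, the grouping that keeps similar features together (`good`) scores higher than the one that splits them (`bad`). Under these assumptions, the classifier is evaluated once per chromosome on the centers, instead of once per combination of substituted attributes.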
