Genetic-Algorithm-Based Instance and Feature Selection

This chapter describes a genetic-algorithm-based approach to selecting a small number of instances from a given data set in a pattern classification problem. The same genetic algorithm also selects a small number of features. The selected instances and features serve as the reference set of a nearest neighbor classifier, and the goal is to improve the classifier's classification ability by searching for an appropriate reference set. We first describe how the genetic algorithm for instance and feature selection is implemented, and then discuss the definition of its fitness function. Next, we examine the classification ability of nearest neighbor classifiers designed by this approach through computer simulations on several data sets. Finally, we examine the effect of instance and feature selection on the learning of neural networks; the results show that instance and feature selection prevents neural networks from overfitting.
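To make the idea concrete, the following is a minimal sketch of how such a search might be coded. It is an illustration under stated assumptions, not the chapter's implementation: the abstract does not specify the encoding, the genetic operators, or the fitness function. The sketch assumes a binary chromosome with one bit per feature followed by one bit per instance, a fitness that rewards correctly classified training patterns and penalizes the number of selected features and instances, and hypothetical weights `w_acc`, `w_feat`, and `w_inst`. Binary tournament selection, uniform crossover, bit-flip mutation, and elitism are likewise assumed choices.

```python
import random

def nn_predict(x, reference, features):
    """Return the label of the reference instance nearest to x,
    measuring squared Euclidean distance over the selected features only."""
    best_label, best_dist = None, float("inf")
    for ref_x, ref_label in reference:
        dist = sum((x[f] - ref_x[f]) ** 2 for f in features)
        if dist < best_dist:
            best_label, best_dist = ref_label, dist
    return best_label

def decode(chromosome, n_features):
    """Split a bit string into selected feature indices and instance indices."""
    features = [i for i in range(n_features) if chromosome[i]]
    instances = [j for j, bit in enumerate(chromosome[n_features:]) if bit]
    return features, instances

def fitness(chromosome, data, n_features, w_acc=5.0, w_feat=1.0, w_inst=1.0):
    """Assumed fitness: weighted count of correctly classified training
    patterns minus penalties on the reference-set size (weights are hypothetical)."""
    features, instances = decode(chromosome, n_features)
    if not features or not instances:
        return float("-inf")  # an empty reference set cannot classify anything
    reference = [data[j] for j in instances]
    correct = sum(nn_predict(x, reference, features) == y for x, y in data)
    return w_acc * correct - w_feat * len(features) - w_inst * len(instances)

def evolve(data, n_features, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Simple generational GA with elitism (operator choices are assumptions)."""
    rng = random.Random(seed)
    length = n_features + len(data)

    def score(c):
        return fitness(c, data, n_features)

    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(pop, key=score)]  # keep the current best individual
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # binary tournament
            p2 = max(rng.sample(pop, 2), key=score)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            child = [1 - b if rng.random() < p_mut else b for b in child]  # mutation
            children.append(child)
        pop = children
    return max(pop, key=score)

# Toy data (made up for illustration): feature 0 separates the classes,
# feature 1 is noise. Each item is (feature vector, class label).
toy_data = [((0.0, 0.9), 0), ((0.1, 0.1), 0), ((0.2, 0.5), 0),
            ((1.0, 0.8), 1), ((0.9, 0.2), 1), ((0.8, 0.6), 1)]

best = evolve(toy_data, n_features=2)
selected_features, selected_instances = decode(best, 2)
```

The penalty terms are what push the search toward small reference sets: among chromosomes that classify the training patterns equally well, the one using fewer features and instances scores higher. How strongly accuracy should be weighted against compactness is a design choice that this sketch leaves to the hypothetical weights.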
