Simultaneous generation of prototypes and features through genetic programming

Nearest-neighbor (NN) methods are highly effective and widely used pattern classification techniques. However, several issues hinder their application to large-scale and noisy data sets: high storage requirements, sensitivity to noisy instances, and the need to compare each test case against all training instances. Prototype generation (PG) and feature generation (FG) techniques aim to alleviate these issues to some extent; traditionally, the two have been implemented separately. This paper introduces a genetic programming approach to the simultaneous generation of prototypes and features for classification with an NN classifier. The proposed method uses genetic programming to learn how to combine instances and attributes, producing a set of prototypes and a new feature space for each class of the classification problem. A heterogeneous representation is proposed together with ad hoc genetic operators. The proposed approach overcomes some limitations of NN without degrading its classification performance. Experimental results are reported and compared with those of several other techniques. The empirical assessment provides evidence of the effectiveness of the proposed approach in terms of classification accuracy and instance/feature reduction.
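
To make the setting concrete, the following is a minimal sketch of NN classification over generated prototypes in a generated feature space. It is not the proposed method: the combination rules here are fixed by hand (class means as prototypes, sum and product of two attributes as features), whereas the paper evolves them with genetic programming, and a single shared feature space is used instead of one per class. The names generate_features, generate_prototypes, and nn_predict are hypothetical and chosen only for illustration.

```python
import numpy as np

def generate_features(X):
    # Assumed, hand-picked feature generators: sum and product of the
    # first two original attributes (a GP run would evolve such expressions).
    return np.column_stack([X[:, 0] + X[:, 1], X[:, 0] * X[:, 1]])

def generate_prototypes(X, y):
    # Assumed, hand-picked prototype generator: one prototype per class,
    # built by averaging the training instances of that class.
    labels = np.unique(y)
    prototypes = np.vstack([X[y == c].mean(axis=0) for c in labels])
    return prototypes, labels

def nn_predict(X_test, prototypes, proto_labels, transform):
    # 1-NN rule: map prototypes and test cases into the generated feature
    # space and assign each test case the label of its closest prototype.
    P = transform(prototypes)
    Z = transform(X_test)
    dists = np.linalg.norm(Z[:, None, :] - P[None, :, :], axis=2)
    return proto_labels[dists.argmin(axis=1)]

# Toy data: six training instances, two classes.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])

prototypes, proto_labels = generate_prototypes(X_train, y_train)
X_test = np.array([[0.5, 0.5], [5.0, 5.0]])
pred = nn_predict(X_test, prototypes, proto_labels, generate_features)
```

In this toy case the six training instances are replaced by two prototypes, so each test case is compared against two points rather than six, which is the kind of instance reduction the abstract refers to.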
