Prototype Selection with Compact Sets and Extended Rough Sets

In this paper, we propose Nearest Neighborhood Rough Sets, a generalization of classical Rough Sets that modifies the indiscernibility relation without using any similarity threshold. We combine these Rough Sets with Compact Sets to obtain a prototype selection algorithm for Nearest Prototype Classification that handles mixed and incomplete data as well as arbitrary dissimilarity functions. We also introduce a set of rules to predict, a priori, the performance of the proposed prototype selection algorithm. Numerical experiments on repository databases show that the proposed method performs well in terms of classifier accuracy and object reduction.
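To make the classification setting concrete, the following is a minimal sketch of Nearest Prototype Classification over mixed and incomplete data with a pluggable dissimilarity function. It does not reproduce the prototype selection algorithm proposed in the paper; it only shows how an already selected set of prototypes would be used. The names `mixed_dissimilarity` and `nearest_prototype`, the HEOM-style treatment of missing values, and the toy data are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Union

Value = Optional[Union[float, str]]  # None marks a missing value
Obj = Sequence[Value]


def mixed_dissimilarity(a: Obj, b: Obj, ranges: Sequence[Optional[float]]) -> float:
    """HEOM-style dissimilarity (an assumed stand-in, not the paper's function).

    ranges[i] is the value range of numeric attribute i, or None for
    categorical attributes.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            d = 1.0  # missing value: maximal per-attribute difference
        elif isinstance(x, str) or isinstance(y, str):
            d = 0.0 if x == y else 1.0  # categorical: overlap metric
        else:
            d = abs(x - y) / r if r else 0.0  # numeric: range-normalised
        total += d * d
    return total ** 0.5


def nearest_prototype(
    query: Obj,
    prototypes: Sequence[Obj],
    labels: Sequence[str],
    dissim: Callable[[Obj, Obj], float],
) -> str:
    """Return the label of the prototype least dissimilar to the query."""
    best = min(range(len(prototypes)), key=lambda i: dissim(query, prototypes[i]))
    return labels[best]


# Hypothetical toy data: (numeric, categorical, numeric) with one missing value.
prototypes = [(1.0, "red", 5.0), (9.0, "blue", None)]
labels = ["A", "B"]
ranges = [10.0, None, 10.0]


def dissim(a: Obj, b: Obj) -> float:
    return mixed_dissimilarity(a, b, ranges)


label_1 = nearest_prototype((2.0, "red", None), prototypes, labels, dissim)
label_2 = nearest_prototype((8.5, "blue", 1.0), prototypes, labels, dissim)
```

Because the dissimilarity is passed in as a function, any other measure suited to the data can be substituted without changing the classifier, which mirrors the abstract's claim of support for arbitrary dissimilarity functions.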
