Prototype selection for dissimilarity-based classifiers

A conventional way to discriminate between objects represented by dissimilarities is the nearest neighbor method. A more efficient and sometimes more accurate solution is offered by other dissimilarity-based classifiers. They construct a decision rule based on the entire training set, but need just a small set of prototypes, the so-called representation set, as a reference for classifying new objects. Such alternative approaches may be especially advantageous for non-Euclidean or even non-metric dissimilarities. The choice of a proper representation set for dissimilarity-based classifiers has not yet been fully investigated. It appears that a random selection may work well. In this paper, a number of experiments have been conducted on various metric and non-metric dissimilarity representations and prototype selection methods. Several procedures, such as traditional feature selection methods (here effectively searching for prototypes), mode seeking and linear programming, are compared with random selection. In general, we find that systematic approaches lead to better results than random selection, especially for a small number of prototypes. Although there is no single winner, as performance depends on data characteristics, the k-centres procedure works well overall. For two-class problems, an important observation is that our dissimilarity-based discrimination functions relying on significantly reduced prototype sets (3-10% of the training objects) offer similar or much better classification accuracy than the best k-NN rule on the entire training set. This may also be achieved for multi-class data; however, such problems are more difficult.
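To illustrate the kind of systematic selection discussed above, the sketch below shows a greedy k-centres prototype selection applied to a precomputed dissimilarity matrix, followed by a simple classifier built in the dissimilarity space spanned by the selected prototypes. This is a minimal sketch under assumptions of our own: the farthest-first greedy variant with random restarts, the nearest-mean decision rule, and the function names (`k_centres`, `dissimilarity_space_classifier`) are illustrative and are not taken from the paper or from PRTools.

```python
import numpy as np

def k_centres(D, k, n_restarts=10, seed=None):
    """Greedy (farthest-first) k-centres prototype selection on a precomputed
    n x n dissimilarity matrix D. Returns the indices of k prototypes chosen
    so that the largest distance from any object to its nearest prototype
    is kept small; the best of several random restarts is returned."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    best, best_cost = None, np.inf
    for _ in range(n_restarts):
        # Start from a random object, then repeatedly add the object
        # that lies farthest from the current prototype set.
        protos = [int(rng.integers(n))]
        for _ in range(k - 1):
            d_to_set = D[:, protos].min(axis=1)
            protos.append(int(d_to_set.argmax()))
        cost = D[:, protos].min(axis=1).max()
        if cost < best_cost:
            best, best_cost = protos, cost
    return np.array(best)

def dissimilarity_space_classifier(D_train, y_train, protos):
    """Train a classifier in the dissimilarity space: each object is embedded
    as its vector of dissimilarities to the prototypes. For brevity this sketch
    uses a nearest-mean rule in that space; the paper's setting would instead
    train a (regularised) linear classifier on all training objects."""
    X = D_train[:, protos]                      # all training objects, reduced representation
    classes = np.unique(y_train)
    means = np.stack([X[y_train == c].mean(axis=0) for c in classes])
    def predict(D_test_to_train):
        Z = D_test_to_train[:, protos]          # embed test objects via the same prototypes
        dists = np.linalg.norm(Z[:, None, :] - means[None, :, :], axis=2)
        return classes[dists.argmin(axis=1)]
    return predict
```

Note that only the columns of the dissimilarity matrix corresponding to the prototypes are needed at test time, which is the source of the computational advantage: the decision rule is built from the entire training set, but new objects require dissimilarities to the small representation set only.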
