Experimental study on prototype optimisation algorithms for prototype-based classification in vector spaces

Prototype-based classification relies on the distances between the examples to be classified and a carefully chosen set of prototypes. A small prototype set is desirable to keep the computational complexity low while maintaining a high classification accuracy. An experimental study of several old and new prototype optimisation techniques is presented, in which the prototypes are either selected from or generated out of the given data. These condensing techniques are evaluated on real data, represented in vector spaces, by comparing the resulting reduction rates and classification performance. The determination of prototypes is usually studied in relation to the nearest neighbour rule. We show that the use of more general dissimilarity-based classifiers can be more beneficial. An important point in our study is that the adaptive condensing schemes discussed here allow the user to choose the number of prototypes freely, according to their needs. If such techniques are combined with linear dissimilarity-based classifiers, they provide the best trade-off between small condensed sets and high classification accuracy.
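For intuition only, the Python sketch below contrasts the two families of classifiers compared in the study: the nearest neighbour rule applied to a small condensed prototype set, and a linear classifier trained on the dissimilarity representation (the distances to those same prototypes). It is not the authors' implementation: per-class k-means stands in for the adaptive condensing schemes, logistic regression stands in for the linear dissimilarity-based classifier, and the dataset and parameter choices are illustrative assumptions.

# Minimal sketch, assuming scikit-learn; k-means condensing and logistic
# regression are stand-ins, not the schemes evaluated in the paper.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import pairwise_distances

def condense(X, y, n_per_class=5, seed=0):
    """Generate a small prototype set by clustering each class separately."""
    protos, labels = [], []
    for c in np.unique(y):
        km = KMeans(n_clusters=n_per_class, n_init=10, random_state=seed).fit(X[y == c])
        protos.append(km.cluster_centers_)
        labels.append(np.full(n_per_class, c))
    return np.vstack(protos), np.concatenate(labels)

def nearest_prototype(X, protos, proto_labels):
    """1-NN rule: assign each example the label of its closest prototype."""
    D = pairwise_distances(X, protos)          # Euclidean dissimilarities
    return proto_labels[D.argmin(axis=1)]

def dissimilarity_classifier(X_train, y_train, protos):
    """Linear classifier trained on distances to the prototypes
    (the dissimilarity representation), not on the raw features."""
    D_train = pairwise_distances(X_train, protos)
    return LogisticRegression(max_iter=1000).fit(D_train, y_train)

if __name__ == "__main__":
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    P, P_lab = condense(X_tr, y_tr, n_per_class=5)   # 10 prototypes in total
    acc_nn = (nearest_prototype(X_te, P, P_lab) == y_te).mean()

    clf = dissimilarity_classifier(X_tr, y_tr, P)
    acc_lin = clf.score(pairwise_distances(X_te, P), y_te)

    print(f"1-NN on prototypes: {acc_nn:.3f}  linear on dissimilarities: {acc_lin:.3f}")

Both classifiers use exactly the same condensed prototype set; the only difference is whether the decision is taken by the nearest prototype or by a linear function of all prototype distances, which is the trade-off examined in the paper.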
