A Novel Kernel Method for Clustering

Kernel methods are algorithms that replace the inner product with an appropriate positive definite function and thereby implicitly perform a nonlinear mapping of the input data into a high-dimensional feature space. In this paper, we present a kernel method for clustering, inspired by the classical k-means algorithm, in which each cluster is iteratively refined using a one-class support vector machine. Our method, which is easy to implement, compares favorably with popular clustering algorithms such as k-means, neural gas, and self-organizing maps on a synthetic data set and three real-data benchmarks from the UCI repository (Iris data, Wisconsin breast cancer database, Spam database).
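As a minimal illustration of the implicit mapping described above (not the clustering algorithm proposed in the paper), the sketch below checks that a kernel evaluation equals an inner product in an explicit feature space, and that squared distances in that space can be computed from kernel values alone. The choice of the homogeneous degree-2 polynomial kernel and all function names (`poly2_kernel`, `explicit_map`, `feature_space_sq_distance`) are assumptions made for this example.

```python
import math


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2


def explicit_map(x):
    """Explicit feature map phi for poly2_kernel on 2-D inputs (illustrative)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def feature_space_sq_distance(kernel, x, y):
    """||phi(x) - phi(y)||^2 using kernel evaluations only, without phi."""
    return kernel(x, x) - 2 * kernel(x, y) + kernel(y, y)


x, y = (1.0, 2.0), (3.0, -1.0)

# Inner product computed implicitly (kernel) and explicitly (mapped vectors).
implicit = poly2_kernel(x, y)
explicit = sum(a * b for a, b in zip(explicit_map(x), explicit_map(y)))

# Squared distance computed implicitly and explicitly.
d2_implicit = feature_space_sq_distance(poly2_kernel, x, y)
d2_explicit = sum((a - b) ** 2 for a, b in zip(explicit_map(x), explicit_map(y)))

print(implicit, explicit)
print(d2_implicit, d2_explicit)
```

Because distances in feature space depend only on kernel values, a distance-based procedure such as k-means can in principle be carried out there without ever constructing the mapped vectors.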
