Support vector visualization and clustering using self-organizing map and support vector one-class classification

This paper presents a new algorithm for support vector visualization and clustering (SVVC), based on the self-organizing map (SOM) and support vector one-class classification (SVOCC). SVOCC was originally designed to identify the support domain of the input data. When it is used for clustering, the high computational cost of checking for cluster gaps between every pair of points makes it impractical for large data sets. In addition, the identified clusters cannot be displayed visually when the data have more than three dimensions. The SOM is a neural network approach that projects high-dimensional data onto a grid, usually two-dimensional, while preserving the topology of the input data. With the proposed SVVC algorithm, the resulting map visually displays the shapes of high-dimensional clusters, and the corresponding clusters can be found. Outliers and cluster borders can be clearly identified on the map, which is an advantage over other SOM-based visualization and clustering methods. The computational complexity of SVVC is lower than that of clustering directly with SVOCC.
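To make the SOM projection step concrete, the following is a minimal sketch of a generic SOM in plain Python. It is not the SVVC algorithm itself and includes no SVOCC component. The function names (`train_som`, `best_matching_unit`), the 4x4 grid, the decay schedules, and the synthetic two-blob data are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((wi - xi) ** 2 for wi, xi in zip(weights[node], x)),
    )


def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM; returns {(row, col): weight vector}.

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each grid node with a copy of a randomly chosen sample.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)   # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the node toward x, more strongly near the BMU on the grid.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Synthetic 5-D data: two Gaussian blobs (an assumption, not the paper's data).
gen = random.Random(1)
data = [[gen.gauss(0.0, 0.3) for _ in range(5)] for _ in range(20)]
data += [[gen.gauss(5.0, 0.3) for _ in range(5)] for _ in range(20)]

weights = train_som(data)

# Project each 5-D point onto the 2-D grid and count hits per node.
hits = {}
for point in data:
    node = best_matching_unit(weights, point)
    hits[node] = hits.get(node, 0) + 1
for r in range(4):
    print(" ".join(f"{hits.get((r, c), 0):2d}" for c in range(4)))
```

The printed hit counts give a 2-D view of where the 5-D points land on the grid. A method such as SVVC would then work on a map of this kind, rather than on every pair of input points, to delineate cluster borders and outliers.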
