Visualization of Support Vector Machines with Unsupervised Learning

Visualizing support vector machines in realistic settings is difficult because the datasets involved are typically high-dimensional. Yet such visualizations usually aid understanding of the model and of the underlying processes, especially in the biosciences. Here we propose a novel technique for visualizing support vector machines based on unsupervised learning, specifically self-organizing maps. Conceptually, a self-organizing map can be thought of as a neural network that searches a high-dimensional data space for clusters of data points and then projects those clusters onto a two-dimensional map, preserving the topology of the original clusters as far as possible. This allows high-dimensional datasets to be visualized together with their support vector models. Using this technique, we investigate a number of support vector machine visualization scenarios based on real-world biomedical datasets.
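To make the projection step concrete, the following is a minimal sketch of a self-organizing map in plain Python. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`train_som`, `best_matching_unit`, `quantization_error`), the 6x6 grid, the linearly decaying learning rate and Gaussian neighborhood, and the synthetic two-cluster data are all hypothetical choices. The support vector machine itself is not trained here.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the map node whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows=6, cols=6, epochs=30, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rows x cols map; all hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.gauss(0.0, 1.0) for _ in range(dim)]
                for _ in range(cols)] for _ in range(rows)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5    # neighborhood shrinks to 0.5
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood on the 2-D grid around the winner.
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def quantization_error(weights, data):
    """Mean Euclidean distance from each point to its best matching unit."""
    total = 0.0
    for x in data:
        r, c = best_matching_unit(weights, x)
        total += math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(weights[r][c], x)))
    return total / len(data)


# Synthetic stand-in for a two-class dataset: two Gaussian clusters in 8 dimensions.
data_rng = random.Random(1)
data = [[data_rng.gauss(center, 1.0) for _ in range(8)]
        for center in (-3.0, 3.0) for _ in range(40)]
labels = [0] * 40 + [1] * 40

som = train_som(data)
# Each 8-dimensional point is projected to a 2-D grid cell.
positions = [best_matching_unit(som, x) for x in data]
```

Each entry of `positions` is a grid cell that can be plotted and colored by its entry in `labels`. In the same way, a flag marking which points a trained support vector machine selected as support vectors could be attached to the cells, which is one plausible way to view a dataset and its support vector model on the same map.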
