An algorithm to cluster data for efficient classification of support vector machines

Support vector machines (SVMs) are widely applied to classification problems. However, most SVMs require lengthy computation time when faced with a large and complicated dataset. This research develops a clustering algorithm for efficient SVM learning. The method groups the data into clusters and selects the critical data within each cluster as a substitute for the original data, reducing the computational complexity. The computational experiments presented in this paper show that the clustering algorithm significantly improves SVM learning efficiency.
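The abstract does not state how the clusters are formed or how the critical data are chosen, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's algorithm. It assumes plain k-means run separately on each class, and it assumes the cluster centroids serve as the substitute data. The names `kmeans`, `reduce_training_set`, and `k_per_class` are hypothetical.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids

def reduce_training_set(X, y, k_per_class):
    """Replace each class's points with up to k_per_class centroids.

    ASSUMPTION: centroids stand in for the 'critical data'; the paper may
    select them differently.
    """
    X_red, y_red = [], []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        k = min(k_per_class, len(members))
        for c in kmeans(members, k):
            X_red.append(c)
            y_red.append(label)
    return X_red, y_red

# Toy data (invented for illustration): two Gaussian blobs, one per class.
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
X += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(200)]
y = [0] * 200 + [1] * 200

X_red, y_red = reduce_training_set(X, y, k_per_class=5)
print(len(X), "->", len(X_red))  # prints: 400 -> 10
```

The reduced pair `X_red`, `y_red` could then be passed to any SVM trainer in place of the full data. Clustering each class separately keeps every substitute point's label unambiguous, which is a design choice of this sketch rather than something the abstract specifies.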
