Sphere Support Vector Machines for large classification tasks

This paper introduces Sphere Support Vector Machines (SphereSVM), a new fast classification algorithm that combines a minimal enclosing ball approach, state-of-the-art nearest point problem solvers, and probabilistic techniques. Blending the three significantly speeds up the SVM training phase while attaining practically the same accuracy as other classification models on several large real datasets, within the strict validation framework of double (nested) cross-validation. The results promote SphereSVM as a strong alternative for handling large and ultra-large datasets in reasonable time, without resorting to the parallelization schemes recently proposed for SVM algorithms.
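To make the minimal enclosing ball ingredient concrete, the sketch below approximates the smallest ball containing a point set with the simple farthest-point iteration of Bădoiu and Clarkson. It is an illustration under stated assumptions, not the SphereSVM training procedure: the function name `approx_enclosing_ball`, the `iterations` parameter, and the use of plain Euclidean points (rather than a kernel-induced feature space) are all choices made here for the example.

```python
import math


def approx_enclosing_ball(points, iterations=1000):
    """Approximate the minimum enclosing ball of a list of points.

    Hypothetical helper for illustration only. Uses the farthest-point
    update c <- c + (p - c) / (i + 1), where p is the point currently
    farthest from the center c.
    """
    center = list(points[0])  # start from an arbitrary input point
    for i in range(1, iterations + 1):
        # Find the point that the current ball covers worst.
        farthest = max(points, key=lambda p: math.dist(p, center))
        # Move the center toward it with a shrinking step size.
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    # The radius is chosen so that the ball encloses every point.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


# Four corners of a square; the exact ball has center (1, 1), radius sqrt(2).
corners = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center, radius = approx_enclosing_ball(corners)
```

Each iteration touches only the farthest point, so the set of points that ever move the center stays small. That is the intuition behind core-set methods for large datasets; how SphereSVM itself exploits it is described in the paper, not in this sketch.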
