Fast Support Vector Classification for Large-Scale Problems

The support vector machine (SVM) is an important machine learning algorithm with state-of-the-art performance on many classification problems. However, on large datasets it is very slow and requires much memory. To address this deficiency, we propose the fast support vector classifier (FSVC), which includes: 1) an efficient closed-form training procedure free of any iterative numerical procedure; 2) a small collection of class prototypes that avoids storing an excessive number of support vectors in memory; and 3) a fast method that selects the spread of the radial basis function kernel directly from the data, without executing the classifier or tuning hyper-parameters iteratively. The FSVC has low memory requirements and is fast: it spends on average only $6\cdot 10^{-7}$ sec. per pattern, input and class, and processes datasets of up to 31 million patterns, 30,000 inputs and 131 classes in less than 1.5 hours (less than 3 hours with only 2 GB of RAM). On average, the FSVC is 10 times faster, requires 12 times less memory and achieves 4.7 percent higher performance than Liblinear, which fails on the 4 largest datasets for lack of memory; compared with Libsvm, it is 100 times faster and achieves only 6.7 percent lower performance. The time spent by FSVC depends only on the dataset size, so it can be accurately estimated for new datasets, whereas Libsvm and Liblinear are much slower on “difficult” datasets, even small ones. The FSVC adjusts its requirements to the available memory, so it can classify large datasets on computers with limited memory. Code for the proposed algorithm in the Octave scientific programming language is provided.
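
The claim that the running time depends only on the dataset size can be illustrated with a back-of-the-envelope calculation. The sketch below is not part of the authors' Octave code; it assumes that the reported average of $6\cdot 10^{-7}$ sec. per pattern, input and class scales as a simple product, and the function name and the example dataset sizes are hypothetical.

```python
# Rough runtime estimate based on the average rate quoted in the abstract.
# Assumption: time ~ rate * patterns * inputs * classes (a simple product).
SECONDS_PER_UNIT = 6e-7  # average sec. per pattern, input and class


def estimate_fsvc_seconds(n_patterns: int, n_inputs: int, n_classes: int,
                          rate: float = SECONDS_PER_UNIT) -> float:
    """Return an approximate running time in seconds (hypothetical helper)."""
    if min(n_patterns, n_inputs, n_classes) <= 0:
        raise ValueError("all dataset dimensions must be positive")
    return rate * n_patterns * n_inputs * n_classes


if __name__ == "__main__":
    # Hypothetical dataset: 1 million patterns, 100 inputs, 10 classes.
    seconds = estimate_fsvc_seconds(1_000_000, 100, 10)
    print(f"estimated time: {seconds:.0f} s (about {seconds / 60:.0f} min)")
```

For the hypothetical dataset above, the estimate comes to about 600 s. Because the rate is an average over datasets, the result should be read as an order-of-magnitude guide rather than a guarantee.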
