Speed Up SVM Algorithm for Massive Classification Tasks

We present a new parallel and incremental Support Vector Machine (SVM) algorithm for classifying very large datasets on graphics processing units (GPUs). SVMs and related kernel methods have been shown to build accurate models, but training usually requires solving a quadratic program, so for large datasets it demands large memory capacity and long training time. We extend the Least Squares SVM (LS-SVM) proposed by Suykens and Vandewalle to build an incremental and parallel algorithm. The new algorithm uses graphics processors to achieve high performance at low cost. Numerical tests on datasets from the UCI and Delve repositories show that our parallel incremental algorithm on GPUs is about 70 times faster than a CPU implementation and often significantly faster (over 1000 times) than state-of-the-art algorithms such as LibSVM, SVM-perf and CB-SVM.
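The incremental idea can be illustrated with a minimal sketch, assuming a linear least-squares SVM whose solution comes from a small linear system of size (n+1) for n input dimensions. The abstract does not give the exact formulation, block size, or GPU kernels, so everything here is an assumption for illustration: the function names (`block_sums`, `solve`, `train_incremental`, `predict`), the constant `c`, and the choice to leave the bias `b` unregularized are all hypothetical. The sketch minimizes (1/2)||w||² + (c/2) Σ (y_i − (w·x_i − b))², which gives (I°/c + Σ z_i z_iᵀ) [w; b] = Σ y_i z_i with z_i = [x_i, −1] and I° the identity with its last diagonal entry set to zero. The two sums can be accumulated one block of rows at a time, so only one block needs to be in memory, and the per-block products are independent, which is the part a GPU could compute in parallel.

```python
def block_sums(X_block, y_block):
    """Return (Z^T Z, Z^T y) for one block of rows, with z = [x, -1]."""
    d = len(X_block[0]) + 1
    G = [[0.0] * d for _ in range(d)]
    h = [0.0] * d
    for x, y in zip(X_block, y_block):
        z = list(x) + [-1.0]
        for i in range(d):
            h[i] += y * z[i]
            for j in range(d):
                G[i][j] += z[i] * z[j]
    return G, h


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def train_incremental(blocks, c=1.0):
    """Accumulate per-block sums, then solve one small system for (w, b)."""
    G_total, h_total = None, None
    for X_block, y_block in blocks:
        G, h = block_sums(X_block, y_block)  # independent per block
        if G_total is None:
            G_total, h_total = G, h
        else:
            for i in range(len(h)):
                h_total[i] += h[i]
                for j in range(len(h)):
                    G_total[i][j] += G[i][j]
    d = len(h_total)
    for i in range(d - 1):  # assumption: regularize w only, not the bias b
        G_total[i][i] += 1.0 / c
    theta = solve(G_total, h_total)
    return theta[:-1], theta[-1]


def predict(w, b, x):
    """Classify x by the sign of w.x - b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - b >= 0 else -1


# Toy usage: six 2-D points fed in three blocks of two rows each.
X = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y = [-1, -1, -1, 1, 1, 1]
blocks = [(X[i:i + 2], y[i:i + 2]) for i in range(0, len(X), 2)]
w, b = train_incremental(blocks, c=1.0)
```

Memory use in this sketch depends on the block size and on n, not on the total number of rows, which is why the approach suits datasets with many rows and a modest number of dimensions. The pure-Python loops stand in for the dense matrix products that a GPU implementation would perform.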

[1]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[2]  D. W. Walker,et al.  LAPACK++: a design overview of object-oriented extensions for high performance linear algebra , 1993, Supercomputing '93.

[3]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[4]  John C. Platt,et al.  Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .

[5]  B. Schölkopf,et al.  Advances in kernel methods: support vector learning , 1999 .

[6]  Huan Liu,et al.  Handling concept drifts in incremental learning with support vector machines , 1999, KDD '99.

[7]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, ICML.

[8]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[9]  Gert Cauwenberghs,et al.  Incremental and Decremental Support Vector Machine Learning , 2000, NIPS.

[10]  David R. Musicant,et al.  Lagrangian Support Vector Machines , 2001, J. Mach. Learn. Res..

[11]  Stefan Rüping,et al.  Incremental Learning with Support Vector Machines , 2001, ICDM.

[12]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[13]  Glenn Fung,et al.  Incremental Support Vector Machine Classification , 2002, SDM.

[14]  Jiawei Han,et al.  Classifying large data sets using SVMs with hierarchical clusters , 2003, KDD '03.

[15]  Gregory Piatetsky-Shapiro,et al.  Summary from the KDD-03 panel: data mining: the next 10 years , 2003, SKDD.

[16]  Mining Very Large Datasets with Support Vector Machine Algorithms , 2003, ICEIS.

[17]  Geoff Hulten,et al.  A General Framework for Mining Massive Data Streams , 2003 .

[18]  Slimane Hammoudi,et al.  Enterprise Information Systems V , 2004 .

[19]  Johan A. K. Suykens,et al.  Least Squares Support Vector Machine Classifiers , 1999, Neural Processing Letters.

[20]  Thorsten Joachims,et al.  Training linear SVMs in linear time , 2006, KDD '06.

[21]  François Poulet,et al.  Classifying one billion data with a new distributed svm algorithm , 2006, 2006 International Conference on Research, Innovation and Vision for the Future.

[22]  Jean-Daniel Fekete,et al.  Large Scale Classification with Support Vector Machine Algorithms , 2007, ICMLA 2007.