Fast Support Vector Machines for Continuous Data

Support vector machines (SVMs) can be trained to be very accurate classifiers and have been used in many applications. However, training time and, to a lesser extent, prediction time can become very long on very large data sets. This paper presents a fast compression method that scales SVMs up to large data sets. A simple bit-reduction method reduces the cardinality of the data by weighting representative examples. SVMs are then trained on the weighted data. Experiments indicate that the bit-reduction SVM significantly reduces the time required for both training and prediction with minimal loss in accuracy. It is also typically more accurate than random sampling when the data are not overcompressed.
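
The abstract does not spell out how the bit reduction is carried out, so the following is only a minimal sketch under stated assumptions: each continuous feature is scaled to an integer of `total_bits` bits, the lowest `drop_bits` bits are shifted away, examples of the same class that land in the same reduced bin are replaced by their mean, and the bin count becomes the weight. The function name `bit_reduce` and its parameters are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def bit_reduce(X, y, total_bits=8, drop_bits=4):
    """Compress (X, y) into weighted representatives (illustrative sketch).

    Assumption: features are min-max scaled to integers in
    [0, 2**total_bits - 1] and the lowest `drop_bits` bits are discarded.
    """
    n_features = len(X[0])
    lows = [min(row[j] for row in X) for j in range(n_features)]
    highs = [max(row[j] for row in X) for j in range(n_features)]
    max_int = (1 << total_bits) - 1

    # Group examples by (class label, reduced-precision feature tuple).
    bins = defaultdict(list)
    for row, label in zip(X, y):
        key = []
        for j, value in enumerate(row):
            span = highs[j] - lows[j]
            scaled = int((value - lows[j]) / span * max_int) if span > 0 else 0
            key.append(scaled >> drop_bits)  # drop the low-order bits
        bins[(label, tuple(key))].append(row)

    # One representative per bin: the mean of its members, weighted by count.
    reps, labels, weights = [], [], []
    for (label, _), members in bins.items():
        count = len(members)
        reps.append([sum(m[j] for m in members) / count
                     for j in range(n_features)])
        labels.append(label)
        weights.append(count)
    return reps, labels, weights


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.01, 0.02], [1.0, 1.0], [0.99, 0.98], [0.0, 0.01]]
    y = [0, 0, 1, 1, 1]
    reps, labels, weights = bit_reduce(X, y)
    for rep, label, weight in zip(reps, labels, weights):
        print(label, weight, rep)
```

The weighted representatives could then be handed to any SVM trainer that accepts per-example weights; for instance, scikit-learn's `SVC.fit` takes a `sample_weight` argument. Increasing `drop_bits` compresses the data more aggressively, which is where the abstract's caution about overcompression would apply.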
