Solving Linear SVMs with Multiple 1D Projections

We present a new methodology for solving linear Support Vector Machines (SVMs) that capitalizes on multiple 1D projections. We show that the approach approximates the optimal solution with high accuracy and comes with analytical guarantees. Our solution adapts methodologies from random projections, exponential search, and coordinate descent. In our experimental evaluation, we compare our approach against the popular liblinear SVM library and demonstrate a significant speedup on various benchmarks. At the same time, the new methodology achieves a comparable or better approximation of the optimal solution and exhibits smooth convergence. Our results are accompanied by bounds on both time complexity and accuracy.
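To make the general idea concrete, the following is a minimal sketch (not the authors' algorithm, and all names, parameters, and the line-search scheme are illustrative assumptions): it restricts the SVM weight vector to the span of a few random 1D projection directions and optimizes their coefficients with coordinate descent on the primal L2-regularized hinge-loss objective.

```python
# Sketch: combine random 1D projections with coordinate descent for a linear SVM.
# This is an illustration of the ingredients named in the abstract, not the
# paper's method; hyperparameters and the grid-based line search are assumptions.
import numpy as np

def random_directions(d, k, rng):
    """Draw k random unit-norm projection directions in R^d."""
    R = rng.standard_normal((k, d))
    return R / np.linalg.norm(R, axis=1, keepdims=True)

def svm_objective(alpha, Z, y, lam):
    """Primal hinge-loss objective with w = alpha @ R, using precomputed Z = X @ R.T."""
    margins = 1.0 - y * (Z @ alpha)
    return lam * alpha @ alpha + np.maximum(margins, 0.0).mean()

def fit_projected_svm(X, y, k=16, lam=1e-2, sweeps=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    R = random_directions(d, k, rng)
    Z = X @ R.T                      # n x k: each column is one 1D projection of the data
    alpha = np.zeros(k)
    for _ in range(sweeps):
        for j in range(k):           # coordinate descent over projection coefficients
            best_a = alpha[j]
            best_f = svm_objective(alpha, Z, y, lam)
            for step in np.linspace(-1.0, 1.0, 21):   # crude 1D line search on coordinate j
                trial = alpha.copy()
                trial[j] += step
                f = svm_objective(trial, Z, y, lam)
                if f < best_f:
                    best_a, best_f = trial[j], f
            alpha[j] = best_a
    return alpha @ R                 # weight vector lifted back to R^d
```

In this sketch, each coordinate update only touches one precomputed 1D projection of the data, which is what makes projection-based schemes attractive for large, high-dimensional problems; the actual paper replaces the crude grid search with exponential search and provides the accompanying guarantees.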
