Memory-efficient large-scale linear support vector machine

Stochastic gradient descent has been advanced as a computationally efficient method for large-scale problems. For classification, many of the proposed linear support vector machine solvers are very effective. However, they assume that the data is already in memory, which might not always be the case. Recent work suggests a classical method that divides such a problem into smaller blocks and then solves the sub-problems iteratively. We show that a simple modification, shrinking the dataset early, produces significant savings in computation and memory. We further find that on problems larger than previously considered, our approach is able to reach solutions on top-end desktop machines while competing methods cannot.
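
The block-wise idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes a dual coordinate descent solver for the L1-loss linear SVM as the per-block sub-problem solver, and a deliberately simple shrinking rule that permanently drops an example once its dual variable sits at a bound and the gradient pushes it further out. The names `train_blockwise_svm` and `dot`, and the parameters `outer_iters`, `inner_iters`, and `shrink`, are hypothetical. Each in-memory list stands in for a block that a real system would load from disk.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def train_blockwise_svm(blocks, dim, C=1.0, outer_iters=10,
                        inner_iters=5, shrink=True, seed=0):
    """Illustrative block-wise dual coordinate descent for a linear SVM.

    blocks: list of blocks; each block is a list of (x, y) pairs with
            x a list of `dim` floats and y in {-1, +1}.
    Returns the weight vector w as a list of floats.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    # One dual variable per example, kept per block.
    alpha = [[0.0] * len(block) for block in blocks]
    # Indices of examples that have not been shrunk away yet.
    active = [list(range(len(block))) for block in blocks]

    for _ in range(outer_iters):
        for b, block in enumerate(blocks):  # "load" one block at a time
            for _ in range(inner_iters):    # solve its sub-problem approximately
                rng.shuffle(active[b])
                keep = []
                for i in active[b]:
                    x, y = block[i]
                    q = dot(x, x)
                    if q == 0.0:
                        continue  # an all-zero example cannot change w
                    g = y * dot(w, x) - 1.0  # dual gradient for example i
                    a_old = alpha[b][i]
                    # Assumed shrinking rule: drop examples stuck at a bound.
                    if shrink and ((a_old == 0.0 and g > 0.0) or
                                   (a_old == C and g < 0.0)):
                        continue
                    # Coordinate update, clipped to the box [0, C].
                    a_new = min(max(a_old - g / q, 0.0), C)
                    step = (a_new - a_old) * y
                    if step != 0.0:
                        w = [wi + step * xi for wi, xi in zip(w, x)]
                    alpha[b][i] = a_new
                    keep.append(i)
                active[b] = keep  # later passes visit fewer examples
    return w
```

Because shrunk examples are never revisited in this sketch, later passes over a block touch fewer examples, which is where the savings in computation would come from. Production solvers typically use a more conservative shrinking criterion and re-check removed examples before declaring convergence.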