Paper Info - Beating SGD: Learning SVMs in Sublinear Time

We present an optimization method for linear SVMs based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights. This yields an optimization method with a sublinear dependence on the training set size, and the first method for learning linear SVMs with runtime less than the size of the training set required for learning!
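The primal-dual scheme described above can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the paper's exact algorithm: it assumes a hard-margin problem without slack variables, examples with norm at most 1, and illustrative step sizes. The function name `primal_dual_svm` and its parameters are hypothetical. The primal step samples one example according to the importance weights and takes an SGD step; the dual step estimates every margin from a single sampled coordinate and reweights the examples multiplicatively, so each iteration costs O(n + d) rather than O(nd).

```python
import math
import random


def primal_dual_svm(X, y, T=2000, seed=0):
    """Simplified stochastic primal-dual sketch for a hard-margin linear SVM.

    Assumptions (not taken from the abstract): rows of X have norm <= 1,
    there are no slack variables, and the step sizes are illustrative.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    eta = math.sqrt(math.log(n) / T)  # dual step size (assumed choice)
    q = [1.0 / n] * n                 # importance weights over examples
    u = [0.0] * d                     # unprojected primal iterate
    w_sum = [0.0] * d                 # running sum for the averaged output

    for _ in range(T):
        # Primal step: sample an example by importance weight, take an SGD step.
        i = rng.choices(range(n), weights=q)[0]
        step = y[i] / math.sqrt(2 * T)
        u = [uk + step * xk for uk, xk in zip(u, X[i])]

        # Project the iterate onto the unit ball.
        scale = max(1.0, math.sqrt(sum(uk * uk for uk in u)))
        w = [uk / scale for uk in u]
        w_sum = [s + wk for s, wk in zip(w_sum, w)]

        w_sq = sum(wk * wk for wk in w)
        if w_sq == 0.0:
            continue  # w is still zero, so there is no margin information yet

        # Dual step: sample coordinate j with probability w[j]^2 / ||w||^2, so
        # that y_k * X[k][j] * ||w||^2 / w[j] is an unbiased margin estimate.
        j = rng.choices(range(d), weights=[wk * wk for wk in w])[0]
        for k in range(n):
            v = y[k] * X[k][j] * w_sq / w[j]
            v = max(-1.0 / eta, min(1.0 / eta, v))  # clip the noisy estimate
            # Multiplicative update: low-margin examples gain relative weight.
            q[k] *= 1.0 - eta * v + (eta * v) ** 2

        # Renormalize to avoid numerical underflow over many iterations.
        total = sum(q)
        q = [qk / total for qk in q]

    return [s / T for s in w_sum]
```

Calling `primal_dual_svm(X, y)` with `X` as a list of feature lists and `y` as a list of labels in {+1, -1} returns an averaged weight vector of norm at most 1; a new point `x` would then be classified by the sign of its inner product with that vector.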

[1] Russell Greiner, et al. Learning and Classifying Under Hard Budgets, 2005, ECML.

[2] Claudio Gentile, et al. On the generalization ability of on-line learning algorithms, 2001, IEEE Transactions on Information Theory.

[3] Yoram Singer, et al. Pegasos: primal estimated sub-gradient solver for SVM, 2011, Math. Program.

[4] Éva Tardos, et al. Fast Approximation Algorithms for Fractional Packing and Covering Problems, 1995, Math. Oper. Res.

[5] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.

[6] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.

[7] David P. Woodruff, et al. Sublinear Optimization for Machine Learning, 2010, FOCS.

[8] Kun Deng, et al. Bandit-Based Algorithms for Budgeted Learning, 2007, Seventh IEEE International Conference on Data Mining (ICDM 2007).

[9] S. Sathiya Keerthi, et al. A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs, 2005, J. Mach. Learn. Res.

[10] David A. McAllester, et al. A discriminatively trained, multiscale, deformable part model, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Claudio Gentile, et al. The Robustness of the p-Norm Algorithms, 1999, COLT '99.

[12] Elad Hazan. The convex optimization approach to regret minimization, 2011.

[13] Ambuj Tewari, et al. Smoothness, Low Noise and Fast Rates, 2010, NIPS.

[14] Martin Zinkevich, et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent, 2003, ICML.

[15] Sanjeev Arora, et al. The Multiplicative Weights Update Method: a Meta-Algorithm and Applications, 2012, Theory Comput.

[16] Nathan Srebro, et al. SVM optimization: inverse dependence on training set size, 2008, ICML '08.