Practical Large Scale Classification with Additive Kernels

For classication problems with millions of training examples or dimensions, accuracy, training and testing speed and memory usage are the main concerns. Recent advances have allowed linear SVM to tackle problems with moderate time and space cost, but for many tasks in computer vision, additive kernels would have higher accuracies. In this paper, we propose the PmSVM-LUT algorithm that employs Look-Up Tables to boost the training and testing speed and save memory usage of additive kernel SVM classication, in order to meet the needs of large scale problems. The PmSVM-LUT algorithm is based on PmSVM (Wu, 2012), which employed polynomial approximation for the gradient function to speedup the dual coordinate descent method. We also analyze the polynomial approximation numerically to demonstrate its validity. Empirically, our algorithm is faster than PmSVM and feature mapping in many datasets with higher classication accuracies and can save up to 60% memory usage as well.

[1]  James M. Rehg,et al.  CENTRIST: A Visual Descriptor for Scene Categorization , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Igor A. Shevchuk,et al.  Theory of Uniform Approximation of Functions by Polynomials , 2008 .

[3]  Simon J. Smith,et al.  Lebesgue constants in polynomial interpolation , 2006 .

[4]  J. Waldvogel,et al.  Numerical Analysis: A Comprehensive Introduction , 1989 .

[5]  Yoram Singer,et al.  Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program..

[6]  Chia-Hua Ho,et al.  Recent Advances of Large-Scale Linear Classification , 2012, Proceedings of the IEEE.

[7]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[8]  Thorsten Joachims,et al.  Making large-scale support vector machine learning practical , 1999 .

[9]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[10]  Andrew Zisserman,et al.  Efficient additive kernels via explicit feature maps , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  J. Douglas Faires,et al.  Numerical Analysis , 1981 .

[12]  Florent Perronnin,et al.  Large-scale image categorization with explicit data embedding , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Ivor W. Tsang,et al.  Improved Nyström low-rank approximation and error analysis , 2008, ICML '08.

[14]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[15]  Jianxin Wu,et al.  Power mean SVM for large scale visual classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Chih-Jen Lin,et al.  A dual coordinate descent method for large-scale linear SVM , 2008, ICML '08.

[17]  Jianxin Wu,et al.  A Fast Dual Method for HIK SVM Learning , 2010, ECCV.