A fast approximation algorithm for 1-norm SVM with squared loss

The 1-norm support vector machine (SVM) has attracted substantial attention for its good sparsity. However, training a 1-norm SVM has a computational complexity that is roughly cubic in the number of samples, which is high. This paper replaces the hinge loss or the ε-insensitive loss with the squared loss in the 1-norm SVM, and applies orthogonal matching pursuit (OMP) to approximate the solution of the resulting squared-loss problem. Experimental results on toy and real-world datasets show that OMP trains the 1-norm SVM faster than some existing methods while achieving similar learning performance.
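To make the idea concrete, the following is a minimal sketch of generic OMP applied to a squared-loss kernel expansion, not the paper's exact algorithm. It assumes a Gaussian kernel matrix as the dictionary, omits any bias term, and uses a fixed atom budget as the stopping rule; the names `omp_fit`, `rbf_kernel`, `max_atoms`, `gamma`, and `tol` are illustrative choices rather than anything taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel)."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def omp_fit(K, y, max_atoms, tol=1e-6):
    """Greedy OMP: approximately minimise ||y - K @ beta||^2 with at most
    max_atoms nonzero entries in beta (a sparse surrogate for the 1-norm penalty)."""
    n, m = K.shape
    norms = np.linalg.norm(K, axis=0)
    norms[norms == 0] = 1.0              # guard against all-zero columns
    residual = y.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(min(max_atoms, m)):
        # Pick the column most correlated with the current residual.
        corr = np.abs(K.T @ residual) / norms
        if support:
            corr[support] = -np.inf      # never reselect a chosen column
        j = int(np.argmax(corr))
        if corr[j] <= tol:
            break
        support.append(j)
        # Refit all selected coefficients by least squares (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(K[:, support], y, rcond=None)
        residual = y - K[:, support] @ coef
        if np.linalg.norm(residual) <= tol:
            break
    beta = np.zeros(m)
    if support:
        beta[support] = coef
    return beta, support

# Toy two-class problem: two well-separated Gaussian blobs with labels -1 / +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (40, 2)), rng.normal(2, 0.5, (40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])

K = rbf_kernel(X, X, gamma=0.5)
beta, support = omp_fit(K, y, max_atoms=5)
train_acc = float((np.sign(K @ beta) == y).mean())
print(len(support), train_acc)
```

Each iteration solves only a small least-squares problem over the selected columns, so the cost grows with the number of selected atoms rather than with a full linear or quadratic program over all samples. This is the intuition behind the speed-up claimed in the abstract; the sketch itself makes no claim about matching the reported results.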
