Sparse learning for support vector classification

This paper presents a sparse learning algorithm for Support Vector Classification (SVC), called Sparse Support Vector Classification (SSVC), which produces sparse solutions by automatically setting irrelevant parameters exactly to zero. SSVC adopts an L0-norm regularization term and is trained by an iteratively reweighted learning algorithm. We show that the proposed approach admits a hierarchical-Bayes interpretation and establishes close connections with other sparse models. More specifically, one variant of the proposed method is equivalent to the zero-norm classifier proposed by Weston et al. (2003); it also provides an extended, more flexible framework in parallel with the Sparse Probit Classifier proposed by Figueiredo (2003). Theoretical justifications and experimental evaluations on two synthetic datasets and seven benchmark datasets show that SSVC offers performance competitive with SVC while requiring significantly fewer support vectors.
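
The iteratively reweighted training idea sketched in the abstract can be made concrete with a small toy. The snippet below is an illustrative sketch under stated assumptions, not the paper's SSVC: it approximates the L0 penalty by a reweighted L2 term on a plain linear model, and uses a squared-error surrogate in place of the hinge loss so that each iteration reduces to a closed-form linear solve. All names and parameters (`sparse_linear_classifier`, `lam`, `eps`, the pruning threshold) are hypothetical choices for the example.

```python
import numpy as np

def sparse_linear_classifier(X, y, lam=1.0, n_iter=50, eps=1e-6, tol=1e-4):
    """Iteratively reweighted-L2 approximation to L0-regularized
    classification (illustrative toy, not the paper's exact SSVC).

    X : (n_samples, n_features) design matrix
    y : (n_samples,) labels in {-1, +1}

    Each step solves the squared-error surrogate
        min_w ||X w - y||^2 + lam * sum_i w_i^2 / (w_i_prev^2 + eps),
    so small weights receive ever-larger penalties and are driven to zero.
    """
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # unregularized start
    for _ in range(n_iter):
        # Reweighting: penalty grows as the previous weight shrinks,
        # a smooth surrogate for the L0 "norm".
        D = lam / (w ** 2 + eps)
        w_new = np.linalg.solve(X.T @ X + np.diag(D), X.T @ y)
        if np.max(np.abs(w_new - w)) < tol:
            w = w_new
            break
        w = w_new
    w[np.abs(w) < 1e-3] = 0.0  # prune negligible weights exactly to zero
    return w

# Toy usage: only the first two of ten features are informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = np.sign(X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.standard_normal(200))
w = sparse_linear_classifier(X, y)
print("nonzero weights:", np.flatnonzero(w))
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

The mechanism that matters is the reweighting: coefficients that shrink receive larger penalties on the next pass and are eventually set exactly to zero. In SSVC the same pruning acts on the kernel expansion coefficients, which is why the trained classifier needs fewer support vectors.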

[1]  Michael E. Tipping. The Relevance Vector Machine, 1999, NIPS.

[2]  Christopher J. C. Burges, et al. Simplified Support Vector Decision Rules, 1996, ICML.

[3]  Bernhard Schölkopf, et al. A tutorial on support vector regression, 2004, Stat. Comput.

[4]  Michael R. Lyu, et al. Maxi–Min Margin Machine: Learning Large Margin Classifiers Locally and Globally, 2008, IEEE Transactions on Neural Networks.

[5]  Chen Lin, et al. Simplify Support Vector Machines by Iterative Learning, 2022, Neural Information Processing - Letters and Reviews.

[6]  Shutao Li, et al. Gene Feature Extraction Using T-Test Statistics and Kernel Partial Least Squares, 2006, ICONIP.

[7]  Mário A. T. Figueiredo. Adaptive Sparseness Using Jeffreys Prior, 2001, NIPS.

[8]  J. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, 1998.

[9]  Lai-Wan Chan, et al. The Minimum Error Minimax Probability Machine, 2004, J. Mach. Learn. Res.

[10]  S. Sathiya Keerthi, et al. Improvements to Platt's SMO Algorithm for SVM Classifier Design, 2001, Neural Computation.

[11]  Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[12]  Robert F. Harrison, et al. A new method for sparsity control in support vector classification and regression, 2001, Pattern Recognit.

[13]  Gunnar Rätsch, et al. Input space versus feature space in kernel-based methods, 1999, IEEE Trans. Neural Networks.

[14]  Bernhard Schölkopf, et al. Use of the Zero-Norm with Linear Models and Kernel Methods, 2003, J. Mach. Learn. Res.

[15]  Mário A. T. Figueiredo. Adaptive Sparseness for Supervised Learning, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[16]  D. N. Zheng, et al. Training sparse MS-SVR with an expectation-maximization algorithm, 2006, Neurocomputing.

[17]  Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[18]  Michael E. Tipping. Sparse Bayesian Learning and the Relevance Vector Machine, 2001, J. Mach. Learn. Res.

[19]  Tom Downs, et al. Exact Simplification of Support Vector Solutions, 2002, J. Mach. Learn. Res.

[20]  Vojislav Kecman, et al. Support vectors selection by linear programming, 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000).

[21]  Tu Bao Ho, et al. An efficient method for simplifying support vector machines, 2005, ICML.

[22]  Bernhard Schölkopf, et al. Support Vector Method for Novelty Detection, 1999, NIPS.

[23]  Michael I. Jordan, et al. A Robust Minimax Approach to Classification, 2003, J. Mach. Learn. Res.

[24]  Alexander J. Smola, et al. Minimal Kernel Classifiers, 2002, J. Mach. Learn. Res.

[25]  Sayan Mukherjee, et al. Support Vector Method for Multivariate Density Estimation, 1999, NIPS.

[26]  Michael E. Tipping, et al. Fast Marginal Likelihood Maximisation for Sparse Bayesian Models, 2003.