A New Perspective on Convex Relaxations of Sparse SVM

This paper proposes a convex relaxation of the sparse support vector machine (SVM) based on the perspective relaxation of mixed-integer nonlinear programs. We seek to minimize the zero-norm of the hyperplane normal vector together with a standard SVM hinge-loss penalty, and we extend our approach to a zero-one loss penalty. The proposed relaxation is a second-order cone formulation that can be solved efficiently by standard conic optimization solvers. We compare the optimization properties and classification performance of the second-order cone formulation with those of sparse SVM formulations previously suggested in the literature.
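
The abstract does not state the formulation explicitly; the following is a minimal sketch of the perspective-relaxation idea it invokes, written for a hypothetical mixed-integer sparse SVM with binary indicators that switch features on and off. The variable names (w, b, xi, z, s), the epigraph variables, and the exact objective weights are assumptions for illustration, not the paper's model.

\[
\begin{aligned}
\min_{w,\,b,\,\xi,\,z,\,s}\quad & \sum_{j=1}^{n} z_j \;+\; C \sum_{i=1}^{m} \xi_i \;+\; \lambda \sum_{j=1}^{n} s_j \\
\text{s.t.}\quad & y_i \bigl( w^{\top} x_i + b \bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \\
& w_j^{2} \le s_j, \qquad w_j = 0 \ \text{whenever } z_j = 0, \qquad z_j \in \{0,1\}.
\end{aligned}
\]

The perspective reformulation strengthens the continuous relaxation by replacing the indicator logic on each pair $(w_j, z_j)$ with the perspective constraint
\[
w_j^{2} \le s_j\, z_j, \qquad 0 \le z_j \le 1,
\]
which is a rotated second-order cone constraint. Under this assumed structure, the relaxed problem is a second-order cone program, which is consistent with the abstract's claim that standard conic optimization solvers apply.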
