Training a Support Vector Machine in the Primal

Most literature on support vector machines (SVMs) concentrates on the dual optimization problem. In this letter, we point out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility. On the contrary, from the primal point of view, new families of algorithms for large-scale SVM training can be investigated.
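To make the idea of primal training concrete, the following is a minimal sketch for the linear case only. It assumes a squared hinge loss (chosen so the objective is differentiable) and plain batch gradient descent; it is not necessarily the algorithm used in the letter. The names `primal_objective`, `train_primal_svm`, and `predict`, the toy data, and the values of `lam`, `step`, and `iters` are all hypothetical choices made for illustration.

```python
# Illustrative sketch: minimize the primal objective of a linear SVM directly,
#   lam/2 * ||w||^2 + sum_i max(0, 1 - y_i * (w . x_i + b))^2
# Assumptions (not from the source): squared hinge loss, batch gradient descent.

def decision_value(w, b, x):
    """Linear decision function w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def primal_objective(w, b, X, y, lam):
    """Regularizer plus squared hinge loss over all training points."""
    reg = 0.5 * lam * sum(wj * wj for wj in w)
    loss = sum(max(0.0, 1.0 - yi * decision_value(w, b, xi)) ** 2
               for xi, yi in zip(X, y))
    return reg + loss


def train_primal_svm(X, y, lam=0.1, step=0.01, iters=2000):
    """Batch gradient descent on the primal objective; returns (w, b)."""
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(iters):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            slack = 1.0 - yi * decision_value(w, b, xi)
            if slack > 0.0:  # only margin-violating points contribute
                coef = -2.0 * slack * yi
                for j in range(d):
                    grad_w[j] += coef * xi[j]
                grad_b += coef
        w = [wj - step * gj for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def predict(w, b, x):
    """Predicted label in {-1, +1}."""
    return 1 if decision_value(w, b, x) >= 0.0 else -1


# Hypothetical, linearly separable toy data.
X = [[2.0, 2.0], [1.5, 2.5], [3.0, 1.0],
     [-2.0, -1.0], [-1.0, -2.5], [-3.0, -1.5]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_primal_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

The point of the sketch is only that the primal variables `w` and `b` can be optimized directly, without forming the dual. A second-order method would typically converge in far fewer iterations than the fixed-step gradient descent assumed here, and the nonlinear (kernel) case requires a different parameterization that this sketch does not cover.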
