Finite Newton method for Lagrangian support vector machine classification

Abstract: An implicit Lagrangian [Math. Programming Ser. B 62 (1993) 277] formulation of a support vector machine classifier that led to a highly effective iterative scheme [J. Machine Learn. Res. 1 (2001) 161] is solved here by a finite Newton method. The proposed method is extremely fast, terminates in 6 or 7 iterations, and can handle classification problems in very high-dimensional spaces (e.g., over 28,000 dimensions) in a few seconds on a 400 MHz Pentium II machine. It can also handle problems with large datasets and requires no specialized software other than a commonly available solver for a system of linear equations. Finite termination of the proposed method is established in this work.
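The abstract gives no formulas, so the following is only a rough sketch of what a Newton iteration of this kind might look like, not the authors' implementation. It assumes the usual Lagrangian SVM dual, min over u >= 0 of (1/2) u'Qu - e'u with Q = I/nu + HH' and H = D[A, -e], and it assumes that minimizing the implicit Lagrangian amounts to solving the piecewise-linear equation h(u) = (Qu - e) - ((Q - alpha I)u - e)_+ = 0 for some alpha > ||Q||. The function name `newton_lsvm`, the parameters `nu`, `tol`, `max_iter`, the factor 1.1 used to pick `alpha`, and the toy data are all illustrative choices. The sketch takes pure Newton steps with no step-size safeguard and forms the full m-by-m matrix Q, which is only sensible for small m. Each step needs nothing beyond one call to a linear-equation solver.

```python
import numpy as np


def newton_lsvm(A, d, nu=1.0, tol=1e-8, max_iter=50):
    """Pure Newton iteration on h(u) = (Qu - e) - ((Q - alpha*I)u - e)_+ = 0.

    A: (m, n) data matrix; d: (m,) labels in {+1, -1}; nu: assumed weight
    on the slack term. Returns (w, gamma, u, steps), where the classifier
    is sign(x @ w - gamma) and steps counts the Newton steps taken.
    """
    m = A.shape[0]
    e = np.ones(m)
    # Assumed LSVM quantities: H = D[A, -e] and Q = I/nu + H H^T.
    H = d[:, None] * np.hstack([A, -np.ones((m, 1))])
    Q = np.eye(m) / nu + H @ H.T
    # Any alpha larger than the spectral norm of Q (illustrative choice).
    alpha = 1.1 * np.linalg.norm(Q, 2)

    u = np.zeros(m)
    steps = 0
    for _ in range(max_iter):
        r = Q @ u - e
        s = r - alpha * u                  # equals (Q - alpha*I) u - e
        h = r - np.maximum(s, 0.0)
        if np.linalg.norm(h) <= tol:
            break
        # Generalized Jacobian of h: Q + E (alpha*I - Q), E = diag(s > 0).
        active = (s > 0).astype(float)
        M = Q + active[:, None] * (alpha * np.eye(m) - Q)
        # One linear solve per Newton step; no line search in this sketch.
        u = u + np.linalg.solve(M, -h)
        steps += 1

    w_gamma = H.T @ u                      # stacks w = A'Du and gamma = -e'Du
    return w_gamma[:-1], w_gamma[-1], u, steps


# Toy usage on eight well-separated 2-D points (hypothetical data).
A = np.array([[2, 2], [2, 3], [3, 2], [3, 3],
              [-2, -2], [-2, -3], [-3, -2], [-3, -3]], dtype=float)
d = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
w, gamma, u, steps = newton_lsvm(A, d)
predictions = np.sign(A @ w - gamma)
```

For problems with many more features than this toy case, or many data points, forming Q explicitly would be wasteful. A practical variant would exploit the low-rank structure of HH' and would likely add a step-size rule to guard against non-convergence.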
