A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

This paper develops a fast method for solving linear SVMs with the L2 loss function that is suited to large scale data mining tasks such as text classification. The method is obtained by modifying the finite Newton method of Mangasarian in several ways. Experiments indicate that it is much faster (e.g., 4-100 fold) than decomposition methods such as SVMlight, SMO and BSVM, especially when the number of examples is large. The paper also suggests ways of extending the method to other loss functions, such as the modified Huber's loss function and the L1 loss function, and to solving ordinal regression problems.
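To make the setting concrete, the following is a minimal sketch of a plain generalized Newton iteration for a linear SVM with L2 (squared hinge) loss. It is not the paper's algorithm, and none of the paper's modifications are reproduced. It assumes the objective f(w) = (lam/2)||w||^2 + (1/2) sum_i max(0, 1 - y_i w.x_i)^2 with no bias term, a dense linear solve, and a simple backtracking line search. All names (`l2svm_newton`, `objective`, `solve`, `dot`, `lam`) are hypothetical and chosen only for illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def objective(w, X, y, lam):
    """Assumed objective: (lam/2)||w||^2 + (1/2) * sum of squared hinge losses."""
    loss = 0.0
    for xi, yi in zip(X, y):
        m = 1.0 - yi * dot(w, xi)
        if m > 0.0:
            loss += 0.5 * m * m
    return 0.5 * lam * dot(w, w) + loss

def l2svm_newton(X, y, lam=1.0, max_iter=50, tol=1e-10):
    d = len(X[0])
    w = [0.0] * d
    for _ in range(max_iter):
        # Active set: examples that currently violate the margin.
        active = [i for i in range(len(X)) if y[i] * dot(w, X[i]) < 1.0]
        # Gradient of the assumed objective.
        grad = [lam * wj for wj in w]
        for i in active:
            m = 1.0 - y[i] * dot(w, X[i])
            for j in range(d):
                grad[j] -= y[i] * m * X[i][j]
        if max(abs(g) for g in grad) < tol:
            break
        # Generalized Hessian: lam * I + sum over active examples of x_i x_i^T.
        H = [[(lam if j == k else 0.0) for k in range(d)] for j in range(d)]
        for i in active:
            for j in range(d):
                for k in range(d):
                    H[j][k] += X[i][j] * X[i][k]
        step = solve(H, [-g for g in grad])
        # Backtracking (Armijo) line search along the Newton direction.
        f0, slope, t = objective(w, X, y, lam), dot(grad, step), 1.0
        while t > 1e-12:
            trial = [wj + t * sj for wj, sj in zip(w, step)]
            if objective(trial, X, y, lam) <= f0 + 1e-4 * t * slope:
                break
            t *= 0.5
        w = [wj + t * sj for wj, sj in zip(w, step)]
    return w

# Toy usage on four 2-D points (illustrative data, not from the paper).
X = [[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1.0, 1.0, -1.0, -1.0]
print(l2svm_newton(X, y, lam=1.0))
```

Because the objective is piecewise quadratic, each step amounts to a regularized least squares solve over the current active set. For large sparse problems a dense elimination like the one above would be impractical, and an iterative least squares solver would be the more natural choice.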
