Interior-Point Methods for Massive Support Vector Machines

We investigate the use of interior-point methods for solving quadratic programming problems with a small number of linear constraints, where the quadratic term consists of a low-rank update to a positive semidefinite matrix. Several formulations of the support vector machine fit into this category. A distinguishing feature of these problems is the volume of data, which can lead to quadratic programs with between 10 and 100 million variables and, if written explicitly, a dense Q matrix. Our code is based on OOQP, an object-oriented interior-point code, with the linear algebra specialized for the support vector machine application. For the targeted massive problems, all of the data is stored out of core, and we overlap computation with input/output to reduce overhead. Results are reported for several linear support vector machine formulations, demonstrating that the method is reliable and scalable.
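
To illustrate why the low-rank structure matters, the following is a minimal sketch of one standard way to exploit it: the Sherman-Morrison-Woodbury identity, which solves a system with a diagonal-plus-low-rank matrix without ever forming the dense n-by-n matrix. The abstract does not state which factorization the code uses, so this is an illustration under stated assumptions, not the authors' implementation. The function name `solve_diag_plus_low_rank`, the strictly positive diagonal `d`, and the factor `V` are all hypothetical.

```python
import numpy as np

def solve_diag_plus_low_rank(d, V, b):
    """Solve (diag(d) + V @ V.T) x = b without forming the dense matrix.

    Assumptions (not taken from the source): d is an (n,) vector of
    strictly positive entries, V is an (n, k) array with k << n, and
    b is an (n,) right-hand side.
    """
    Dinv_b = b / d                    # D^{-1} b, O(n) work
    Dinv_V = V / d[:, None]           # D^{-1} V, O(n k) work
    k = V.shape[1]
    # Small k-by-k "capacitance" matrix I + V^T D^{-1} V
    S = np.eye(k) + V.T @ Dinv_V
    t = np.linalg.solve(S, V.T @ Dinv_b)
    # Woodbury: x = D^{-1} b - D^{-1} V (I + V^T D^{-1} V)^{-1} V^T D^{-1} b
    return Dinv_b - Dinv_V @ t
```

In this sketch the cost is dominated by forming the k-by-k matrix, which takes O(n k^2) operations and needs only sequential passes over the rows of `V`; that access pattern is the kind that suits data stored out of core.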
