Least squares support vector machine classifiers: a large scale algorithm

Support vector machines (SVMs) have been introduced in the literature as a method for pattern recognition and function estimation, within the framework of statistical learning theory and structural risk minimization. A least squares version (LS-SVM) has recently been reported, which expresses training as the solution of a set of linear equations instead of the quadratic programming problem of the standard SVM. In this paper we present an iterative training algorithm for LS-SVMs based on a conjugate gradient method. This enables solving large scale classification problems, which is illustrated on a multi two-spiral benchmark problem.

Keywords: support vector machines, classification, neural networks, RBF kernels, conjugate gradient method.
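To make the idea of training by linear equations concrete, the following is a minimal sketch, not the paper's implementation. It assumes the usual LS-SVM classifier system, in which the multipliers alpha and bias b satisfy y^T alpha = 0 and y b + H alpha = 1, with H_ij = y_i y_j K(x_i, x_j) + delta_ij / gamma. Because the full system is indefinite, the sketch runs conjugate gradient on two positive definite systems in H and combines the solutions. The function names, the kernel form exp(-||x - z||^2 / sigma^2), and the default values of gamma and sigma are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Assumed kernel form: K(x, z) = exp(-||x - z||^2 / sigma^2).
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sigma ** 2)

def conjugate_gradient(H, rhs, tol=1e-10, max_iter=1000):
    # Plain conjugate gradient for a symmetric positive definite matrix H.
    x = np.zeros_like(rhs, dtype=float)
    r = rhs.astype(float)          # residual for the zero starting point
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Hp = H @ p
        step = rs / (p @ Hp)
        x += step * p
        r -= step * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def train_lssvm(X, y, gamma=10.0, sigma=1.0):
    # H is positive definite thanks to the 1/gamma ridge term.
    K = rbf_kernel(X, X, sigma)
    H = (y[:, None] * y[None, :]) * K + np.eye(len(y)) / gamma
    eta = conjugate_gradient(H, y)                 # solves H eta = y
    nu = conjugate_gradient(H, np.ones(len(y)))    # solves H nu = 1
    b = eta.sum() / (y @ eta)                      # enforces y^T alpha = 0
    alpha = nu - eta * b
    return alpha, b

def predict(X_train, y_train, alpha, b, X_new, sigma=1.0):
    # Decision function: sign(sum_i alpha_i y_i K(x, x_i) + b).
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Small illustrative data set (hypothetical, not from the paper).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
alpha, b = train_lssvm(X, y)
labels = predict(X, y, alpha, b, X)
```

For large problems one would avoid forming H explicitly and compute the products H @ p on the fly from the kernel; the dense matrix here only keeps the sketch short.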
