A fast iterative single data approach to training unconstrained least squares support vector machines

Abstract Least squares support vector machines (LS-SVMs) express training as the solution of a system of linear equations or, equivalently, a quadratic program (QP) with a single linear equality constraint, whereas conventional support vector machines (SVMs) require a QP with lower and upper bounds and one linear equality constraint. For large-scale problems, however, the linear equality constraint impedes the application of several well-developed methods. In this paper, we first eliminate the linear equality constraint from the QP used to train the LS-SVM, turning it into an unconstrained QP, and then propose a fast iterative single-data approach with stepsize acceleration for this unconstrained problem. By combining a variable selection rule with coordinate descent, the proposed approach outperforms the successive over-relaxation (SOR) method; at the same time, updating only one variable per iteration makes it simpler and more flexible than sequential minimal optimization (SMO). Experimental results on several benchmark data sets show that the proposed approach is more efficient than the existing single-data approach and SMO methods.
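To make the idea concrete, the following is a minimal Python sketch of single-data (coordinate) descent on an unconstrained LS-SVM dual of the assumed form min_a (1/2) a'Ha - 1'a with H = diag(y) K diag(y) + I/gamma. The selection rule shown (largest absolute gradient component) and the plain Newton step along each coordinate only stand in for the paper's variable selection rule and stepsize acceleration, which are not reproduced here.

import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Gram matrix of the Gaussian (RBF) kernel.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_coordinate_descent(X, y, gamma=1.0, sigma=1.0, tol=1e-6, max_iter=10000):
    # Illustrative sketch only: minimizes 0.5*a'Ha - 1'a by exact line search
    # along one coordinate at a time; the paper's exact selection rule and
    # stepsize acceleration are not reproduced.
    n = len(y)
    H = (y[:, None] * y[None, :]) * rbf_kernel(X, sigma) + np.eye(n) / gamma
    a = np.zeros(n)
    grad = -np.ones(n)                     # gradient H a - 1 at a = 0
    for _ in range(max_iter):
        i = int(np.argmax(np.abs(grad)))   # pick the worst-violating coordinate
        if abs(grad[i]) < tol:
            break
        delta = -grad[i] / H[i, i]         # exact minimizer along coordinate i
        a[i] += delta
        grad += delta * H[:, i]            # incremental gradient update, O(n)
    return a

# Usage: for a new point x, the (bias-free) decision value is
#   f(x) = sum_i a[i] * y[i] * k(x_i, x).

Because each iteration touches only a single row of H, the per-step cost is O(n), which is what makes single-data schemes attractive for large training sets.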
