Support Vector Machines for Regression: A Succinct Review of Large-Scale and Linear Programming Formulations

Support vector-based learning methods are an important part of computational intelligence techniques. Recent efforts have addressed the problem of learning from very large datasets. This paper reviews the most commonly used formulations of support vector machines for regression (SVRs), with emphasis on their usability in large-scale applications. We review the general concept of support vector machines (SVMs), address the state of the art in training methods for SVMs, and explain the fundamental principle of SVRs. The most common learning methods for SVRs are introduced, and linear programming-based SVR formulations are explained, emphasizing their suitability for large-scale learning. Finally, this paper discusses some open problems and current trends.
