Stochastic Gradient Twin Support Vector Machine for Large Scale Problems

Abstract For large scale classification problems, the stochastic gradient descent method PEGASOS has been successfully applied to support vector machines (SVMs). In this paper, we propose a stochastic gradient twin support vector machine (SGTSVM) based on the twin support vector machine (TWSVM). Compared to PEGASOS, our method is insensitive to stochastic sampling. Furthermore, we prove that SGTSVM converges and that it approximates TWSVM under uniform sampling, whereas PEGASOS converges only almost surely and has only a chance of obtaining an approximation to SVM. In addition, we extend SGTSVM to nonlinear classification problems via the kernel trick. Experiments on artificial and publicly available datasets show that our method performs stably and handles large scale problems easily.
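For orientation, the stochastic sub-gradient step that PEGASOS applies to a linear SVM can be sketched as follows. This is a minimal illustration of the baseline method only, not the SGTSVM procedure proposed in this paper. It assumes a linear model without a bias term, one uniformly sampled example per iteration, the step size 1/(lambda t), and no optional projection step. The names pegasos_train and predict, the toy data, and the parameter values are hypothetical choices made for this example.

```python
import random


def pegasos_train(X, y, lam=0.01, n_iters=2000, seed=0):
    """Sketch of PEGASOS-style updates for a linear SVM (no bias, no projection)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))      # draw one training example uniformly
        eta = 1.0 / (lam * t)          # step size 1 / (lambda * t)
        # Margin is computed with the weights from before this update.
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink step coming from the regularizer (lambda / 2) * ||w||^2.
        w = [(1.0 - eta * lam) * wj for wj in w]
        # Hinge-loss sub-gradient step, taken only if the margin is violated.
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Classify x by the sign of the linear score w . x."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score >= 0 else -1


# Hypothetical toy data: two linearly separable classes.
X = [[2.0, 1.0], [1.5, 2.0], [3.0, 2.5],
     [-2.0, -1.0], [-1.5, -2.5], [-3.0, -1.5]]
y = [1, 1, 1, -1, -1, -1]

w = pegasos_train(X, y)
preds = [predict(w, x) for x in X]
```

Because each step depends on a single randomly drawn example, different seeds generally produce different iterates, which is the sampling sensitivity the abstract contrasts SGTSVM against.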
