OLLAWV: OnLine Learning Algorithm using Worst-Violators

Abstract: As datasets continue to grow, the need for scalable and accurate machine learning algorithms has become evident. Stochastic gradient descent methods are popular tools for optimizing large-scale learning problems because of their generalization performance, simplicity, and scalability. This paper proposes a novel stochastic (also known as online) learning algorithm for solving the L1 support vector machine (SVM) problem, named OnLine Learning Algorithm using Worst-Violators (OLLAWV). Unlike other stochastic methods, OLLAWV requires neither a user-specified maximum number of iterations nor a regularization term; instead, it uses early stopping to control the size of the margin. The experimental study, performed under strict nested cross-validation (also known as double resampling), compares OLLAWV with state-of-the-art SVM kernel methods that have been shown to outperform traditional and widely used approaches for solving L1-SVMs, such as Sequential Minimal Optimization. OLLAWV is also compared with five traditional non-SVM algorithms. The results over 23 datasets show OLLAWV's superior performance in terms of accuracy, scalability, and model sparseness, making it suitable for large-scale learning.
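The idea summarized in the abstract (pick the worst violator, update only its coefficient, stop as soon as no sample violates the margin) can be illustrated with a minimal sketch. This is not the paper's reference implementation. The Gaussian kernel, the step size 2/sqrt(t), the bias update scaled by 1/n, the rule that each sample is picked at most once, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel_column(X, i, gamma):
    """Gaussian kernel values between sample i and every sample in X."""
    diff = X - X[i]
    return np.exp(-gamma * np.sum(diff * diff, axis=1))


def worst_violator_sgd(X, y, C=1.0, gamma=1.0, margin=1.0):
    """Hypothetical worst-violator online update for a kernel L1-SVM.

    Assumed details (not taken from the abstract): step size 2/sqrt(t),
    bias step divided by n, and each sample selected at most once.
    """
    n = X.shape[0]
    alpha = np.zeros(n)              # signed dual-like coefficients
    bias = 0.0
    output = np.zeros(n)             # cached decision values f(x_i)
    unused = np.ones(n, dtype=bool)  # samples not yet selected
    t = 0
    while unused.any():
        t += 1
        # Worst violator: smallest y_i * f(x_i) among unused samples.
        scores = np.where(unused, y * output, np.inf)
        wv = int(np.argmin(scores))
        # Early stopping: no remaining sample violates the margin.
        if scores[wv] >= margin:
            break
        eta = 2.0 / np.sqrt(t)       # assumed decaying learning rate
        step = eta * C * y[wv]
        bias_step = step / n
        alpha[wv] += step
        bias += bias_step
        # Only one kernel column is needed per iteration.
        output += step * rbf_kernel_column(X, wv, gamma) + bias_step
        unused[wv] = False
    return alpha, bias, output


def decision_function(X_train, alpha, bias, X_new, gamma):
    """Evaluate f(x) = sum_j alpha_j * k(x_j, x) + bias for new points."""
    sq = np.sum((X_new[:, None, :] - X_train[None, :, :]) ** 2, axis=2)
    return np.exp(-gamma * sq) @ alpha + bias
```

In this sketch the loop length is decided by the data rather than by an iteration budget, and samples that are never selected keep a zero coefficient, which is one plausible reading of how early stopping could yield a sparse model.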
