Sparse and heuristic support vector machine for binary classifier and regressor fusion

Single-hidden-layer feedforward networks (SLFNs) are classical methods for binary classification and regression, with several well-known variants such as support vector machines (SVMs) and extreme learning machines (ELMs). Obtaining a powerful feature mapping with a simple network structure remains an open problem for SLFNs. In this paper, we propose a framework called sparse and heuristic SVM (SH-SVM) that fuses different SLFNs at the feature-mapping level to obtain a more powerful feature mapping and improve generalization performance. By fusing different SLFNs, SH-SVM benefits from the learning capabilities of each constituent model. As an example, the fusion of SVM and ELM is studied in detail. A sparse representation method then selects the most powerful hidden nodes, yielding a compact SLFN. Furthermore, an efficient method is proposed for solving the sparse representation problem in SH-SVM. Experiments on 25 data sets comparing eight methods show that SH-SVM achieves satisfactory results with a compact network structure.
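The fuse-then-prune pipeline described above can be pictured concretely. The sketch below is a minimal illustration, not the paper's SH-SVM algorithm: it assumes an ELM-style random sigmoid hidden layer, an RBF kernel map against randomly chosen anchor points as a stand-in for the SVM feature mapping, and a plain L1 (Lasso) fit as the sparse-representation step. The function names `elm_features` and `rbf_features` and all parameter choices are illustrative assumptions.

```python
# Sketch: fuse two SLFN feature maps, then keep only the hidden nodes
# that survive an L1-penalized fit (a generic stand-in for the paper's
# sparse representation step).
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def elm_features(X, n_hidden=50):
    """ELM-style hidden layer: random weights and biases, sigmoid activation."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def rbf_features(X, anchors, gamma=1.0):
    """SVM-style feature map: RBF kernel values against anchor points."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

# Toy binary classification data with labels in {-1, +1}.
X = rng.standard_normal((200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])

# Fused hidden layer: concatenate both feature mappings column-wise.
anchors = X[rng.choice(len(X), 20, replace=False)]
H = np.hstack([elm_features(X), rbf_features(X, anchors)])

# Sparse selection: the L1 penalty zeroes out the weights of weak
# hidden nodes; the surviving columns form the compact network.
lasso = Lasso(alpha=0.01).fit(H, y)
keep = np.flatnonzero(lasso.coef_)
print(f"kept {keep.size} of {H.shape[1]} hidden nodes")

# Train the final linear classifier on the selected hidden nodes only.
clf = LinearSVC().fit(H[:, keep], y)
print("training accuracy:", clf.score(H[:, keep], y))
```

The design point the sketch tries to capture is that fusion happens in the hidden layer (the columns of `H`), not at the decision level, so the sparsity step can pick the strongest nodes from either model before a single output layer is trained.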
