Improving ESVM with Generalized Cross-Validation

Extreme Learning Machine (ELM) applies to generalized single-hidden-layer feedforward networks (SLFNs), in which the hidden layer (also called the feature mapping) need not be tuned. The Extreme Support Vector Machine (ESVM), which combines the Support Vector Machine (SVM) with the ELM kernel, can achieve better prediction capability. ESVM usually attains relatively good predictive accuracy, and its training time is shorter than that of SVM in most cases. However, estimating the regularization parameter of ESVM is very time-consuming. Moreover, the effects of the variance of the hidden-layer weights and of the number of hidden neurons on ESVM remain unclear. Generalized Cross-Validation (GCV) is widely used in statistics because it can efficiently estimate the ridge parameter without estimating the error variance. In this work, we study a connection between ESVM and GCV. Specifically, we view the computation of the separating plane in ESVM as a ridge regression problem and propose using GCV to estimate the regularization parameter of ESVM. Experimental results show that GCV significantly improves the efficiency of ESVM without loss of accuracy. Furthermore, the regularization parameter estimated by GCV helps to analyze how the variance of the hidden-layer weights and the number of hidden neurons affect the performance of ESVM.
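To make the GCV step concrete, the following is a minimal sketch in Python (NumPy only) of how a ridge parameter for an ELM-style random feature map could be selected by minimizing the GCV score. The function names (`elm_features`, `gcv_ridge`), the sigmoid activation, and the Gaussian hidden-layer weights are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def elm_features(X, n_hidden, weight_scale=1.0, rng=None):
    """ELM-style random hidden layer: sigmoid(X W + b).

    weight_scale controls the standard deviation of the random
    hidden-layer weights (one of the quantities the abstract
    studies via the estimated regularization parameter)."""
    rng = np.random.default_rng(rng)
    W = rng.normal(0.0, weight_scale, size=(X.shape[1], n_hidden))
    b = rng.normal(0.0, weight_scale, size=n_hidden)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def gcv_ridge(H, y, lambdas):
    """Choose the ridge parameter by minimizing the GCV score
    GCV(lam) = n * ||(I - S(lam)) y||^2 / tr(I - S(lam))^2,
    where S(lam) = H (H^T H + lam I)^{-1} H^T.

    A single thin SVD of H suffices to evaluate the score for
    every candidate lam, so no refitting is needed."""
    n = H.shape[0]
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    Uty = U.T @ y
    best_lam, best_score = None, np.inf
    for lam in lambdas:
        d = s ** 2 / (s ** 2 + lam)   # eigenvalues of S(lam)
        resid = y - U @ (d * Uty)     # (I - S(lam)) y
        score = n * (resid @ resid) / (n - d.sum()) ** 2
        if score < best_score:
            best_lam, best_score = lam, score
    return best_lam

# Toy usage: +/-1 labels treated as regression targets,
# in the spirit of proximal-SVM-style formulations.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=200))
H = elm_features(X, n_hidden=100, weight_scale=1.0, rng=1)
lam = gcv_ridge(H, y, lambdas=np.logspace(-6, 3, 30))
beta = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
```

Because the SVD is computed once and reused across all candidate values, this kind of GCV scan is far cheaper than refitting the model for each value, which suggests why GCV-based selection can improve the efficiency of ESVM parameter estimation.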
