Sparse LS-SVM two-steps model selection method

The Least Squares Support Vector Machine (LS-SVM) replaces the hinge loss of the SVM with a least squares loss, which simplifies training from a quadratic programming problem to the solution of a linear system. A sparse LS-SVM is obtained by applying a pruning procedure. The performance of a sparse LS-SVM depends on the choice of hyper-parameters (i.e. the kernel and penalty parameters). Currently, cross-validation (CV) and leave-one-out (LOO) are the most common methods for selecting LS-SVM hyper-parameters. However, CV is computationally expensive, while LOO yields a validation error with high variance, which may mislead hyper-parameter selection. Moreover, selecting the kernel and penalty parameters simultaneously requires a search over a high-dimensional parameter space. In this work, we propose a new two-step hyper-parameter selection method. In the first step, the Distance between Two Classes (DBTC) method is adopted to select the kernel parameters by maximizing the between-class separation of the projected samples in the feature space. The data distribution, however, is not helpful for selecting the penalty parameter. Therefore, in the second step, we propose to select the penalty parameter by minimizing a Localized Generalization Error, which enhances the generalization capability of the LS-SVM. Experimental comparisons with existing methods show that the proposed two-step method yields better LS-SVMs in terms of average testing accuracy.
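As a rough illustration of two ideas in the abstract, the sketch below trains an LS-SVM classifier by solving a single linear system and picks an RBF kernel width by maximizing the distance between the two class means in feature space. It is a minimal sketch under stated assumptions, not the paper's implementation: the standard LS-SVM classifier formulation, an RBF kernel, a small candidate grid, a fixed penalty value, and synthetic data are all assumed here, and every function name (`rbf_kernel`, `class_mean_distance`, `train_lssvm`, `predict_lssvm`) is hypothetical. Pruning and the Localized Generalization Error step are not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Squared Euclidean distance between every row of A and every row of B.
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def class_mean_distance(X, y, sigma):
    # Squared distance between the two class means in feature space,
    # written with kernel evaluations only (assumed reading of DBTC).
    pos, neg = X[y == 1], X[y == -1]
    return (rbf_kernel(pos, pos, sigma).mean()
            + rbf_kernel(neg, neg, sigma).mean()
            - 2.0 * rbf_kernel(pos, neg, sigma).mean())

def train_lssvm(X, y, sigma, gamma):
    # Standard LS-SVM classifier system:
    # [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    # with Omega_ij = y_i * y_j * K(x_i, x_j).
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def predict_lssvm(X_train, y_train, alpha, b, sigma, X_new):
    # Decision rule: sign(sum_i alpha_i * y_i * K(x, x_i) + b).
    scores = rbf_kernel(X_new, X_train, sigma) @ (alpha * y_train) + b
    return np.where(scores >= 0, 1, -1)

# Synthetic two-class data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (40, 2)), rng.normal(2.0, 1.0, (40, 2))])
y = np.concatenate([-np.ones(40), np.ones(40)])

# Step 1: choose the kernel width from a hypothetical candidate grid.
sigmas = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
best_sigma = max(sigmas, key=lambda s: class_mean_distance(X, y, s))

# Step 2 is replaced by a fixed penalty value here (assumption).
b, alpha = train_lssvm(X, y, best_sigma, gamma=10.0)
train_acc = np.mean(predict_lssvm(X, y, alpha, b, best_sigma, X) == y)
print(best_sigma, train_acc)
```

Because the kernel width is chosen from the class separation alone, no classifier has to be retrained for each candidate width, which is where the saving over a joint grid search with CV would come from in a setup like this one.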
