A Comparison of Pruning Algorithms for Sparse Least Squares Support Vector Machines

Least Squares Support Vector Machines (LS-SVM) is a proven method for classification and function approximation. In comparison to the standard Support Vector Machine (SVM), training an LS-SVM only requires solving a linear system, but the solution lacks sparseness in the number of terms. Pruning can therefore be applied. Standard ways of pruning the LS-SVM recursively solve the approximation problem and then omit the data points that had a small error in the previous pass; the selection is based on the support values. We suggest a slightly adapted variant that improves the performance significantly. We assess the relative regression performance of these pruning schemes on independent test sets in benchmark experiments, comparing them with two subset selection schemes adapted for pruning (one based on the QR decomposition, which is supervised, and one that searches for the most representative feature vector span, which is unsupervised), with random omission, and with backward selection.
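To make the standard support-value pruning scheme concrete, the following minimal sketch fits an LS-SVM by solving its linear system and repeatedly omits the training points with the smallest absolute support values |alpha_i|. The RBF kernel choice, the hyperparameters, and the function names (rbf_kernel, lssvm_fit, prune_lssvm, keep_frac, drop_frac) are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch of support-value pruning for LS-SVM regression.
    # Names and hyperparameters are assumptions, not the paper's code.
    import numpy as np

    def rbf_kernel(X1, X2, sigma=1.0):
        # Gaussian RBF kernel matrix between the rows of X1 and X2.
        sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dists / (2.0 * sigma ** 2))

    def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
        # Solve the LS-SVM system [0, 1^T; 1, K + I/gamma][b; alpha] = [0; y]
        # for the bias b and the support values alpha.
        n = len(y)
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        return sol[0], sol[1:]

    def prune_lssvm(X, y, keep_frac=0.5, drop_frac=0.05, **kw):
        # Standard pruning: refit, then omit the points whose support
        # values |alpha_i| are smallest, until keep_frac of the data remains.
        idx = np.arange(len(y))
        while len(idx) > keep_frac * len(y):
            _, alpha = lssvm_fit(X[idx], y[idx], **kw)
            n_drop = max(1, int(drop_frac * len(idx)))
            idx = idx[np.argsort(np.abs(alpha))[n_drop:]]
        b, alpha = lssvm_fit(X[idx], y[idx], **kw)
        return idx, b, alpha

The pruned model then predicts using only the retained points, e.g. y_hat = rbf_kernel(X_test, X[idx]) @ alpha + b, which is where the sparseness gain over the full LS-SVM solution comes from.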
