An Effective Method of Pruning Support Vector Machine Classifiers

Support vector machine (SVM) classifiers often contain many support vectors (SVs), which lead to high computational cost at runtime and a risk of overfitting. In this paper, a practical and effective method of pruning SVM classifiers is systematically developed. The kernel row vectors, which correspond one-to-one to the SVs, are first organized into clusters. The pruning then proceeds in two phases. In the first phase, orthogonal projections (OPs) are performed to find kernel row vectors that can be approximated by the others. In the second phase, the vectors found in the first phase are removed, and crosswise propagations, which simply reuse the coefficients of the OPs, are carried out within each cluster. The method circumvents the problem of explicitly discerning SVs in the high-dimensional feature space, as the SVM formulation would require, and does not suffer from local minima. With different parameter settings, 3000 experiments were run on the LibSVM software platform. After pruning 42% of the SVs, the average change in classification accuracy was only -0.7%, and the average computation time for removing one SV was 0.006 of the training time. In some scenarios, over 90% of the SVs were pruned with less than a 0.1% reduction in classification accuracy. The experiments demonstrate that trained SVMs contain large numbers of superabundant SVs, and they suggest a synergistic use of training and pruning in practice. Many SVMs already deployed in applications could be upgraded by pruning nearly half of their SVs.
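
To make the two-phase idea concrete, below is a minimal Python sketch of kernel-row pruning. It is an illustration under stated assumptions, not the authors' implementation: the function names (rbf_kernel, prune_svm), the RBF kernel choice, the greedy one-at-a-time strategy, and the residual threshold tol are all assumptions of this sketch, and the clustering phase is omitted, so projections run over all remaining rows. The sketch finds a kernel row vector that is well approximated by a least-squares orthogonal projection onto the other rows, removes the corresponding SV, and propagates its coefficient to the remaining SVs via the projection coefficients.

    import numpy as np

    def rbf_kernel(X, Y, gamma=0.5):
        # Gaussian RBF kernel matrix between the rows of X and Y (assumed kernel).
        sq = (np.sum(X**2, axis=1)[:, None]
              + np.sum(Y**2, axis=1)[None, :]
              - 2.0 * X @ Y.T)
        return np.exp(-gamma * sq)

    def prune_svm(sv, alpha, gamma=0.5, tol=1e-3):
        # Greedy sketch: repeatedly remove the SV whose kernel row vector is best
        # approximated (least-squares orthogonal projection) by the other rows,
        # folding its coefficient into theirs ("crosswise propagation").
        keep = list(range(len(sv)))
        K = rbf_kernel(sv, sv, gamma)        # K[i] is the kernel row vector of SV i
        alpha = np.asarray(alpha, dtype=float).copy()
        while len(keep) > 1:
            best = None                      # (residual, j, others, coefficients)
            for j in keep:
                others = [i for i in keep if i != j]
                c = np.linalg.lstsq(K[others].T, K[j], rcond=None)[0]
                r = np.linalg.norm(K[others].T @ c - K[j])
                if best is None or r < best[0]:
                    best = (r, j, others, c)
            r, j, others, c = best
            if r > tol:                      # no remaining row is redundant enough
                break
            alpha[others] += alpha[j] * c    # propagate alpha_j via OP coefficients
            keep.remove(j)
        return sv[keep], alpha[keep]

    # Toy check: duplicated SVs make kernel rows linearly dependent, so roughly
    # half of them should be pruned while the decision values barely move.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(20, 2))
    sv = np.vstack([base, base + 1e-4 * rng.normal(size=base.shape)])
    alpha = rng.normal(size=len(sv))
    sv_p, alpha_p = prune_svm(sv, alpha, gamma=0.5, tol=1e-2)
    queries = rng.normal(size=(5, 2))
    drift = (rbf_kernel(queries, sv, 0.5) @ alpha
             - rbf_kernel(queries, sv_p, 0.5) @ alpha_p)
    print(len(sv) - len(sv_p), "SVs pruned; max decision drift:", np.abs(drift).max())

In the paper the projections are restricted to within-cluster kernel rows, which keeps each least-squares problem small; the global greedy loop above trades that efficiency for brevity.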
