Optimizing resources in model selection for support vector machines

Tuning SVM kernel parameters is an important step in achieving a high-performing learning machine. The usual automatic methods for tuning these parameters require either an inversion of the Gram matrix or the resolution of an additional quadratic programming problem. For large datasets, these methods add substantial memory and CPU-time costs to the already significant resources consumed by SVM training itself. In this paper, we propose a fast method, based on an approximation of the gradient of the empirical error combined with incremental learning, that reduces the required resources in both processing time and storage space.
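
To make the idea concrete, here is a minimal sketch, not the paper's implementation: it tunes the RBF width gamma by gradient descent on a smoothed empirical validation error (a sigmoid of the signed margins, in the spirit of Platt-style probabilistic outputs). The analytical gradient approximation and the incremental learner described in the abstract are replaced by a central finite difference and scikit-learn's batch SVC; the helper `smoothed_error`, the sharpness constant `a`, the fixed `C`, the learning rate, and the step count are all illustrative assumptions.

```python
# Hedged sketch: gradient-based selection of an RBF kernel parameter by
# descending a smoothed (differentiable) surrogate of the empirical error.
# NOT the paper's exact method: the analytical gradient approximation is
# replaced by a finite difference, and incremental learning by batch SVC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def smoothed_error(log_gamma, X_tr, y_tr, X_va, y_va, a=5.0):
    """Mean sigmoid of the negated validation margins: a smooth surrogate
    for the 0/1 empirical error (a, C are illustrative choices)."""
    clf = SVC(kernel="rbf", gamma=np.exp(log_gamma), C=10.0).fit(X_tr, y_tr)
    margins = y_va * clf.decision_function(X_va)   # > 0 means correct
    z = np.clip(a * margins, -50.0, 50.0)          # avoid overflow in exp
    return float(np.mean(1.0 / (1.0 + np.exp(z)))) # sigmoid(-a * margin)

# Synthetic binary problem with labels in {-1, +1}.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
y = 2 * y - 1
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.5, random_state=0)

log_gamma, lr, h = np.log(1.0), 0.5, 0.1
for _ in range(20):
    # Central finite difference in log-space, standing in for the paper's
    # analytical approximation of the gradient of the empirical error.
    g = (smoothed_error(log_gamma + h, X_tr, y_tr, X_va, y_va)
         - smoothed_error(log_gamma - h, X_tr, y_tr, X_va, y_va)) / (2 * h)
    log_gamma -= lr * g

print(f"selected gamma: {np.exp(log_gamma):.4f}, "
      f"smoothed validation error: "
      f"{smoothed_error(log_gamma, X_tr, y_tr, X_va, y_va):.4f}")
```

Working in log-space keeps gamma positive without constraints; the cost per step here is two full retrainings, which is exactly the overhead the paper's gradient approximation and incremental updates are designed to avoid on large datasets.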
