A Heuristic for Free Parameter Optimization with Support Vector Machines

A heuristic is proposed for selecting the free parameters of Support Vector Machines, with the goals of improving generalization performance and reducing sensitivity to training set selection. The many local extrema in these optimization problems make gradient descent algorithms impractical. The core of the proposed heuristic is the inclusion of a model complexity measure in the objective to improve generalization performance. We also use simulated annealing to search the parameter space more efficiently than an exhaustive grid search, and compute an intensity-weighted centre of mass of the best points found to reduce the volatility of the selected parameters. We evaluate the heuristic on two standard classification problems for comparison, and apply it to classification tasks in bioinformatics and retinal electrophysiology.
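To make the procedure concrete, the sketch below gives one plausible reading of the heuristic in Python with scikit-learn: simulated annealing over log-scaled (C, gamma) for an RBF-kernel SVM, an objective that adds a model complexity penalty to the cross-validation error, and an intensity-weighted centre of mass over the best points visited. The penalty weight, the support-vector-fraction complexity proxy, the cooling schedule, and the example dataset are illustrative assumptions, not the authors' exact formulation.

    # Hedged sketch: simulated annealing over SVM hyperparameters with a
    # complexity-penalised objective, then an intensity-weighted centre of
    # mass over the best points found. Penalty weight, complexity proxy,
    # and cooling schedule are assumptions for illustration only.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X, y = load_breast_cancer(return_X_y=True)
    X = StandardScaler().fit_transform(X)

    def objective(log_c, log_gamma, penalty=0.1):
        """Cross-validated error plus an assumed model-complexity penalty."""
        clf = SVC(C=10.0 ** log_c, gamma=10.0 ** log_gamma)
        cv_error = 1.0 - cross_val_score(clf, X, y, cv=5).mean()
        # Fraction of training points kept as support vectors, used here
        # as a stand-in complexity measure (an assumption of this sketch).
        complexity = len(clf.fit(X, y).support_) / len(X)
        return cv_error + penalty * complexity

    # Simulated annealing over log10(C) in [-2, 4], log10(gamma) in [-5, 1].
    point = np.array([1.0, -2.0])          # initial (log10 C, log10 gamma)
    energy = objective(*point)
    temperature, cooling = 1.0, 0.9
    visited = [(point.copy(), energy)]

    for _ in range(60):
        candidate = np.clip(point + rng.normal(scale=0.5, size=2),
                            [-2.0, -5.0], [4.0, 1.0])
        cand_energy = objective(*candidate)
        # Accept downhill moves always; uphill with Boltzmann probability.
        if cand_energy < energy or \
           rng.random() < np.exp((energy - cand_energy) / temperature):
            point, energy = candidate, cand_energy
        visited.append((point.copy(), energy))
        temperature *= cooling

    # Intensity-weighted centre of mass of the best points: weight each of
    # the top candidates by its improvement over the worst of them.
    visited.sort(key=lambda pe: pe[1])
    top = visited[:10]
    weights = np.array([top[-1][1] - e for _, e in top]) + 1e-12
    centre = np.average(np.stack([p for p, _ in top]), axis=0, weights=weights)
    print("Selected log10(C), log10(gamma):", centre)

In this reading, the centre-of-mass step averages away single-run noise in the annealing trajectory, which is one way to obtain the insensitivity to training set selection that the abstract describes.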
