In-sample model selection for Support Vector Machines

In-sample model selection for Support Vector Machines is a promising approach that allows using the training set both for learning the classifier and for tuning its hyperparameters. This is a welcome improvement with respect to out-of-sample methods, such as cross-validation, which require removing some samples from the training set and reserving them exclusively for model selection. Unfortunately, in-sample methods require precise control of the classifier's function space, which can be achieved only through an unconventional SVM formulation based on Ivanov regularization. In this work we prove that, even in this case, it is possible to exploit well-known Quadratic Programming solvers such as Sequential Minimal Optimization, thus improving the applicability of the in-sample approach.
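The contrast between the two formulations can be sketched as follows, assuming the standard soft-margin setting with feature map φ and slack variables ξ (notation chosen here for illustration, not taken from the paper):

```latex
% Conventional (Tikhonov-regularized) soft-margin SVM: the size of the
% function space is controlled only indirectly, via the hyperparameter C.
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^2 + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad y_i\bigl(w\cdot\phi(x_i)+b\bigr) \ge 1-\xi_i,\ \ \xi_i \ge 0

% Ivanov-regularized SVM: the function space is controlled directly,
% through an explicit budget \rho on the norm of w.
\min_{w,\,b,\,\xi}\ \sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad \lVert w \rVert^2 \le \rho^2,\ \
y_i\bigl(w\cdot\phi(x_i)+b\bigr) \ge 1-\xi_i,\ \ \xi_i \ge 0
```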
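The paper's contribution is an adaptation of SMO itself to the Ivanov-constrained problem. As a much cruder illustration of why off-the-shelf SMO solvers remain usable, the sketch below (emphatically not the paper's algorithm) bisects over the Tikhonov parameter C until the trained model's squared weight norm matches a given budget, exploiting the fact that the norm of the solution grows with C along the regularization path. The function name, search interval, and tolerance are all illustrative assumptions.

```python
# Hedged sketch, not the paper's method: approximate the Ivanov constraint
# ||w||^2 <= rho2 by searching over C in a standard SMO-based solver.
import numpy as np
from sklearn.svm import SVC

def ivanov_svm_via_bisection(X, y, rho2, c_lo=1e-3, c_hi=1e3, tol=1e-3):
    """Return a linear SVC whose squared weight norm is close to rho2."""
    clf = None
    for _ in range(50):
        c_mid = np.sqrt(c_lo * c_hi)          # bisect C on a log scale
        clf = SVC(kernel="linear", C=c_mid).fit(X, y)
        w2 = float(np.sum(clf.coef_ ** 2))    # ||w||^2 of the trained model
        if abs(w2 - rho2) < tol:
            break
        if w2 > rho2:
            c_hi = c_mid                      # norm exceeds the budget: shrink C
        else:
            c_lo = c_mid                      # norm below the budget: grow C
    return clf
```

A practical implementation would also need to handle budgets unreachable within the search interval; the SMO adaptation proved in the paper avoids this outer search entirely.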
