Meta-learning Recommendation of Default Hyper-parameter Values for SVMs in Classification Tasks

Machine learning algorithms have been investigated in several scenarios, one of which is data classification. The predictive performance of the models induced by these algorithms is usually strongly affected by the values used for their hyper-parameters. Different approaches to defining these values have been proposed, such as the use of default values and of optimization techniques. Although default values can result in models with good predictive performance, different implementations of the same machine learning algorithm use different default values, leading to models with clearly different predictive performance for the same dataset. Optimization techniques have been used to search for hyper-parameter values that maximize the predictive performance of the induced models for a given dataset, but at a high computational cost. A compromise is to use an optimization technique to search for values that are suitable for a wide spectrum of datasets. This paper investigates the use of meta-learning to recommend default values for the induction of Support Vector Machine models for a new classification dataset. We compare the default values suggested by the Weka and LibSVM tools with default values optimized by meta-heuristics on a large range of datasets. This study covers only classification tasks, but we believe that similar ideas could be used in other related tasks. According to the experimental results, meta-models can accurately predict whether tool-suggested or optimized default values should be used.
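The recommendation step described above can be illustrated with a minimal sketch. It assumes that each past dataset is summarized by three simple meta-features (numbers of examples, features, and classes) and labelled with the default setting that produced the better SVM on it. The meta-features, the example values, the names `META_DATASET` and `recommend_defaults`, and the 1-nearest-neighbour meta-learner are all hypothetical choices made for illustration; they are not the meta-features, data, or meta-models used in this study.

```python
import math

# Hypothetical meta-dataset: one entry per past dataset, holding simple
# meta-features and the default setting that gave the better SVM there.
# All values are illustrative and are NOT results from the paper.
META_DATASET = [
    # (n_examples, n_features, n_classes), winning default setting
    ((150, 4, 3), "tool"),
    ((200, 8, 2), "tool"),
    ((5000, 60, 2), "optimized"),
    ((8000, 120, 10), "optimized"),
]


def _scale(meta_features):
    # Log-scale the counts so that magnitudes of different orders are comparable.
    return [math.log10(value) for value in meta_features]


def recommend_defaults(meta_features, meta_dataset=META_DATASET):
    """Return the label of the most similar past dataset (1-nearest-neighbour)."""
    query = _scale(meta_features)

    def distance(entry):
        # Euclidean distance between log-scaled meta-feature vectors.
        return math.dist(query, _scale(entry[0]))

    return min(meta_dataset, key=distance)[1]


# Recommend a default setting for a new, unseen dataset.
print(recommend_defaults((6000, 80, 2)))
```

In practice the meta-learner would be trained on many more datasets and richer meta-features; the nearest-neighbour rule is used here only because it keeps the sketch short and dependency-free.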
