Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA

WEKA is a widely used, open-source machine learning platform. Due to its intuitive interface, it is particularly popular with novice users. However, such users often find it hard to identify the best approach for their particular dataset among the many available. We describe the new version of Auto-WEKA, a system designed to help such users by automatically searching through the joint space of WEKA's learning algorithms and their respective hyperparameter settings to maximize performance, using a state-of-the-art Bayesian optimization method. Our new package is tightly integrated with WEKA, making it just as accessible to end users as any other learning algorithm.

[1]  Nando de Freitas,et al.  A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.

[2]  Joaquin Vanschoren,et al.  Selecting Classification Algorithms with Active Testing , 2012, MLDM.

[3]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[4]  Ron Kohavi,et al.  A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.

[5]  Alain Biem,et al.  A model selection criterion for classification: application to HMM topology optimization , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..

[6]  Kevin Leyton-Brown,et al.  Sequential Model-Based Optimization for General Algorithm Configuration , 2011, LION.

[7]  Mohamed Cheriet,et al.  Model selection for the LS-SVM. Application to handwriting recognition , 2009, Pattern Recognit..

[8]  A. McQuarrie,et al.  Regression and Time Series Model Selection , 1998 .

[9]  Yoshua Bengio,et al.  Model Selection for Small Sample Regression , 2002, Machine Learning.

[10]  Carlos Soares,et al.  Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results , 2003, Machine Learning.

[11]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[12]  Michèle Sebag,et al.  Collaborative hyperparameter tuning , 2013, ICML.

[13]  Jasper Snoek,et al.  Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[14]  Yoshua Bengio,et al.  Gradient-Based Optimization of Hyperparameters , 2000, Neural Computation.

[15]  H. Bozdogan Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions , 1987 .

[16]  Mauro Birattari,et al.  The irace Package: Iterated Race for Automatic Algorithm , 2011 .

[17]  Vadim V. Strijov,et al.  Nonlinear regression model generation using hyperparameter optimization , 2010, Comput. Math. Appl..

[18]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[19]  Donald R. Jones,et al.  Global versus local search in constrained optimization of computer models , 1998 .

[20]  Hilan Bensusan,et al.  Meta-Learning by Landmarking Various Learning Algorithms , 2000, ICML.

[21]  X. C. Guo,et al.  A novel LS-SVMs hyper-parameter selection based on particle swarm optimization , 2008, Neurocomputing.

[22]  Donald R. Jones,et al.  Efficient Global Optimization of Expensive Black-Box Functions , 1998, J. Glob. Optim..

[23]  Ricardo Vilalta,et al.  A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.

[24]  Michael A. Osborne,et al.  Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces , 2014, 1409.4011.

[25]  Bernd Bischl,et al.  mlr: Machine Learning in R , 2016, J. Mach. Learn. Res..

[26]  Peng Zhao,et al.  On Model Selection Consistency of Lasso , 2006, J. Mach. Learn. Res..

[27]  Aaron Klein,et al.  Efficient and Robust Automated Machine Learning , 2015, NIPS.

[28]  Michael A. Osborne,et al.  A Kernel for Hierarchical Parameter Spaces , 2013, ArXiv.

[29]  Andrew W. Moore,et al.  Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation , 1993, NIPS.

[30]  Yoshua Bengio,et al.  Algorithms for Hyper-Parameter Optimization , 2011, NIPS.

[31]  Yoshua Bengio,et al.  Random Search for Hyper-Parameter Optimization , 2012, J. Mach. Learn. Res..

[32]  Kevin Leyton-Brown,et al.  Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms , 2012, KDD.

[33]  Andreas Krause,et al.  Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.

[34]  Katharina Eggensperger,et al.  Towards an Empirical Foundation for Assessing Bayesian Optimization of Hyperparameters , 2013 .