Using a Gradient-Based Multikernel Gaussian Process and a Meta-Acquisition Function to Accelerate SMBO

Automatic machine learning (AutoML) is a crucial technology in machine learning. Sequential model-based optimisation (SMBO) algorithms (e.g., SMAC, TPE) are state-of-the-art hyperparameter optimisation methods in AutoML. However, SMBO ignores known information, such as gradients and the high-probability range of the best hyperparameters. In this paper, we accelerate the traditional SMBO method and name the resulting method accSMBO. In accSMBO, we build a gradient-based multikernel Gaussian process with good generalisation ability, and we design a meta-acquisition function that encourages SMBO to focus on the high-probability range of the best hyperparameters. In experiments on an L2-regularised logistic loss function, our method exhibited state-of-the-art performance.
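To make the SMBO loop the abstract builds on concrete, here is a minimal sketch of vanilla sequential model-based optimisation with a Gaussian-process surrogate and an expected-improvement acquisition function. This is the generic baseline the paper accelerates, not accSMBO itself: the single RBF kernel, the grid of candidates, and all parameter values below are illustrative assumptions, and the paper's multikernel GP and meta-acquisition function are not reproduced here.

```python
import numpy as np
from math import erf

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """GP posterior mean and std at test points Xs, given data (X, y)."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance is 1 for RBF
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI acquisition for minimisation: E[max(best - f, 0)]."""
    z = (best - mu) / sigma
    Phi = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (best - mu) * Phi + sigma * phi

def smbo(objective, bounds, n_init=3, n_iter=10, seed=0):
    """Plain SMBO: fit GP surrogate, maximise EI, evaluate, repeat."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, n_init)           # initial random design
    y = np.array([objective(x) for x in X])
    cand = np.linspace(lo, hi, 200)           # candidate grid
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X, y, cand)
        ei = expected_improvement(mu, sigma, y.min())
        x_next = cand[np.argmax(ei)]          # most promising candidate
        X = np.append(X, x_next)
        y = np.append(y, objective(x_next))
    i = np.argmin(y)
    return X[i], y[i]
```

For example, `smbo(lambda x: (x - 0.3) ** 2, (0.0, 1.0))` returns a hyperparameter close to the true minimiser 0.3 after a handful of expensive evaluations; accSMBO additionally feeds gradient observations into the surrogate and biases the acquisition toward the region where the best hyperparameters are most likely to lie.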
