Cost Effective Optimization for Cost-related Hyperparameters

The increasing demand for democratizing machine learning algorithms for general software developers calls for hyperparameter optimization (HPO) solutions at low cost. Many machine learning algorithms have hyperparameters that can cause a large variation in training cost, but this effect is largely ignored in existing HPO methods, which cannot properly control cost during the optimization process. To address this problem, we develop a cost-effective HPO solution. The core of our solution is a new randomized direct-search method. We prove a convergence rate of $O(\frac{\sqrt{d}}{\sqrt{K}})$ and provide an analysis of how it can be used to control evaluation cost under reasonable assumptions. Extensive evaluation using the latest AutoML benchmark shows strong anytime performance of the proposed HPO method when tuning cost-related hyperparameters.
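
The abstract names a new randomized direct-search method without spelling it out. As a rough illustration of the general technique, the sketch below implements a generic randomized direct search: each iteration samples a uniform direction on the unit sphere, probes the objective on both sides of the current point, and shrinks the step size when neither probe improves. The function name, the two-sided probing rule, and the geometric step schedule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def randomized_direct_search(f, x0, step=1.0, shrink=0.5, max_evals=200, seed=0):
    """Minimal randomized direct search (illustrative, not the paper's method).

    Each iteration draws a uniform random direction u on the unit sphere,
    probes f at x + step*u and x - step*u, moves to the first improving
    probe, and shrinks the step size when neither probe improves.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    evals = 1
    while evals + 2 <= max_evals:
        u = rng.standard_normal(x.size)
        u /= np.linalg.norm(u)          # uniform direction on the unit sphere
        improved = False
        for cand in (x + step * u, x - step * u):
            fc = f(cand)
            evals += 1
            if fc < fx:                 # accept the first improving probe
                x, fx = cand, fc
                improved = True
                break
        if not improved:
            step *= shrink              # no improvement along +/- u: shrink step
    return x, fx

# Toy usage: a quadratic stands in for validation loss over two
# hypothetical cost-related hyperparameters.
best_x, best_loss = randomized_direct_search(
    lambda x: float(np.sum((x - 3.0) ** 2)), x0=np.zeros(2))
```

Starting such a search from a low-cost configuration with a small step size would keep early evaluations cheap when the hyperparameters are cost-related, which is one plausible reading of how a method of this kind can control evaluation cost.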
