Dynamic Control of Explore/Exploit Trade-Off in Bayesian Optimization

Bayesian optimization makes it possible to optimize black-box functions that are not accessible through traditional techniques. The success of Bayesian optimization methods such as Expected Improvement (EI) is significantly affected by the balance struck between exploration and exploitation. Too much exploration leads to inefficient optimization protocols, whilst too much exploitation leaves the protocol vulnerable to strong initial biases and a high chance of getting stuck in a local minimum. Typically, a constant margin is used to control this trade-off, which introduces yet another hyperparameter to be tuned. We propose contextual improvement as a simple yet effective heuristic that removes this hyperparameter, yielding a one-shot optimization strategy. The proposed heuristic is cheap to compute and improves both the speed and the robustness with which optimal solutions are discovered. We demonstrate its effectiveness on synthetic and real-world problems and examine the unaccounted-for uncertainty introduced by fixing in advance the search hyperparameters that control the explore-exploit trade-off.
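The abstract contrasts EI with a constant exploration margin against a margin that adapts to the surrogate model's current state. Below is a minimal sketch of the idea, assuming the standard EI-with-margin formulation for minimization; the specific `contextual_margin` scaling used here (mean predictive standard deviation relative to the incumbent) is an illustrative assumption, not the paper's exact definition of contextual improvement.

```python
# Minimal sketch (not the authors' reference implementation): standard
# Expected Improvement with a constant exploration margin `xi`, plus an
# illustrative "contextual" variant in which the margin is recomputed each
# iteration from the surrogate's average predictive uncertainty. The exact
# form of the contextual-improvement heuristic is an assumption here.
import numpy as np
from scipy.stats import norm


def expected_improvement(mu, sigma, f_best, xi=0.01):
    """EI for minimization with a fixed exploration margin xi."""
    sigma = np.maximum(sigma, 1e-12)           # guard against zero variance
    z = (f_best - mu - xi) / sigma
    return (f_best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)


def contextual_margin(sigma, f_best, eps=1e-12):
    """Hypothetical context-dependent margin: scale exploration by the
    surrogate's mean predictive std relative to the incumbent value."""
    return np.mean(sigma) / (abs(f_best) + eps)


def contextual_expected_improvement(mu, sigma, f_best):
    """EI whose margin is recomputed from the current posterior, so the
    explore/exploit trade-off adapts as the model becomes more certain."""
    xi_t = contextual_margin(sigma, f_best)
    return expected_improvement(mu, sigma, f_best, xi=xi_t)


# Example usage with a toy posterior over five candidate points:
mu = np.array([0.3, 0.1, 0.5, 0.2, 0.4])       # posterior means
sigma = np.array([0.2, 0.05, 0.3, 0.1, 0.25])  # posterior standard deviations
f_best = 0.15                                  # best observed (incumbent) value
print(contextual_expected_improvement(mu, sigma, f_best))
```

Under this assumed scaling, the margin shrinks as the surrogate becomes more certain, so the acquisition shifts from exploration early in the run towards exploitation later, without a hand-tuned value of xi.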
