Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

Bayesian optimization is a sequential decision-making framework for optimizing expensive-to-evaluate black-box functions. Computing a full lookahead policy amounts to solving a highly intractable stochastic dynamic program. Myopic approaches, such as expected improvement, are often adopted in practice, but they ignore the long-term impact of the immediate decision. Existing nonmyopic approaches are mostly heuristic and/or computationally expensive. In this paper, we provide the first efficient implementation of general multi-step lookahead Bayesian optimization, formulated as a sequence of nested optimization problems within a multi-step scenario tree. Instead of solving these problems in a nested way, we equivalently optimize all decision variables in the full tree jointly, in a "one-shot" fashion. Combining this with an efficient method for implementing multi-step Gaussian process "fantasization," we demonstrate that multi-step expected improvement is computationally tractable and exhibits performance superior to existing methods on a wide range of benchmarks.
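To make the one-shot idea concrete, the following is a minimal, self-contained sketch (plain PyTorch with a toy RBF Gaussian process, not the paper's actual implementation) of a two-step lookahead tree. Rather than solving an inner maximization for every fantasy sample, the next-step candidates `x1[i]` are lifted into the decision space and optimized jointly with the current candidate `x0` by gradient ascent; holding the base samples `z` fixed (the reparameterization trick) makes the whole tree objective deterministic and differentiable. All names here (`rbf`, `gp_posterior`, `ei`, `n_fantasies`) are illustrative assumptions, not from the paper.

```python
import torch
from torch.distributions import Normal

torch.manual_seed(0)

def rbf(X1, X2, lengthscale=0.3):
    # Squared-exponential kernel on pairwise Euclidean distances.
    return torch.exp(-0.5 * torch.cdist(X1, X2).pow(2) / lengthscale**2)

def gp_posterior(x, X, y, noise=1e-4):
    # Exact GP posterior mean and stddev at query points x.
    K = rbf(X, X) + noise * torch.eye(len(X))
    K_inv = torch.linalg.inv(K)  # fine at toy sizes; use a Cholesky solve in practice
    k_x = rbf(x, X)
    mean = k_x @ K_inv @ y
    var = (rbf(x, x).diagonal() - (k_x @ K_inv @ k_x.T).diagonal()).clamp_min(1e-10)
    return mean, var.sqrt()

def ei(mean, std, best):
    # Closed-form expected improvement (maximization convention).
    u = (mean - best) / std
    normal = Normal(0.0, 1.0)
    return std * (u * normal.cdf(u) + normal.log_prob(u).exp())

# Toy observations on [0, 1].
X = torch.rand(5, 1)
y = torch.sin(6.0 * X).squeeze(-1)
best = y.max()

# One-shot decision variables for the whole two-step tree: the current
# candidate x0, plus one next-step candidate per fantasy branch.
n_fantasies = 8
x0 = torch.rand(1, 1, requires_grad=True)
x1 = torch.rand(n_fantasies, 1, requires_grad=True)
z = torch.randn(n_fantasies)  # fixed base samples (reparameterization trick)

optimizer = torch.optim.Adam([x0, x1], lr=0.05)
for _ in range(200):
    optimizer.zero_grad()
    mean0, std0 = gp_posterior(x0, X, y)
    value = ei(mean0, std0, best).sum()  # immediate one-step EI at x0
    for i in range(n_fantasies):
        y_fant = mean0 + std0 * z[i]   # differentiable fantasized outcome at x0
        X_aug = torch.cat([X, x0])     # GP "fantasization": condition on (x0, y_fant)
        y_aug = torch.cat([y, y_fant])
        mean1, std1 = gp_posterior(x1[i : i + 1], X_aug, y_aug)
        value = value + ei(mean1, std1, torch.maximum(best, y_fant.squeeze())).sum() / n_fantasies
    (-value).backward()  # ascend the full-tree objective jointly over (x0, x1)
    optimizer.step()
    with torch.no_grad():  # keep candidates inside the box [0, 1]
        x0.clamp_(0.0, 1.0)
        x1.clamp_(0.0, 1.0)

print("two-step one-shot candidate:", x0.detach().squeeze())
```

For deeper trees and production use, BoTorch ships a qMultiStepLookahead acquisition function along these lines; the sketch above only illustrates the joint (one-shot) parameterization of the tree.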
