Computationally Efficient High-Dimensional Bayesian Optimization via Variable Selection

Bayesian Optimization (BO) is a method for globally optimizing black-box functions. While BO has been successfully applied to many scenarios, developing effective BO algorithms that scale to functions with high-dimensional domains remains a challenge. Optimizing such functions with vanilla BO is extremely time-consuming. Alternative strategies for high-dimensional BO based on embedding the high-dimensional space into a lower-dimensional one are sensitive to the choice of the embedding dimension, which must be pre-specified. We develop a new computationally efficient high-dimensional BO method that exploits variable selection. Our method automatically learns axis-aligned subspaces, i.e., spaces spanned by the selected variables, without requiring any pre-specified hyperparameters. We theoretically analyze the computational complexity of our algorithm and derive its regret bound. We empirically show the efficacy of our method on several synthetic and real problems.
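To illustrate the idea of learning an axis-aligned subspace, the following is a minimal sketch, assuming variable relevance is scored by inverse ARD lengthscales of a fitted Gaussian process surrogate (a common heuristic; the paper's actual selection procedure may differ, and `keep_ratio` is a hypothetical cutoff introduced only for this example):

```python
import numpy as np

def select_variables(lengthscales, keep_ratio=0.1):
    """Return indices of likely-relevant axes from ARD lengthscales.

    Under an ARD kernel, a short lengthscale means the objective varies
    quickly along that axis, so the variable is treated as relevant.
    An axis is kept if its relevance (inverse lengthscale) is at least
    `keep_ratio` times the maximum relevance.
    """
    relevance = 1.0 / np.asarray(lengthscales, dtype=float)
    return np.flatnonzero(relevance >= keep_ratio * relevance.max())

# Toy example: a 6-D problem where only axes 0 and 3 matter,
# so their fitted lengthscales are short and the rest are long.
idx = select_variables([0.5, 50.0, 40.0, 0.8, 60.0, 45.0])
# idx → array([0, 3])
```

In a full BO loop, the acquisition function would then be optimized only over the selected axes (with the remaining coordinates fixed or marginalized), which is what makes the per-iteration cost scale with the subspace dimension rather than the ambient dimension.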
