Paper Information - Bayesian Optimization with Binary Auxiliary Information

Bayesian Optimization with Binary Auxiliary Information

This paper presents novel mixed-type Bayesian optimization (BO) algorithms to accelerate the optimization of a target objective function by exploiting correlated auxiliary information of binary type that can be more cheaply obtained, such as in policy search for reinforcement learning and hyperparameter tuning of machine learning models with early stopping. To achieve this, we first propose a mixed-type multi-output Gaussian process (MOGP) to jointly model the continuous target function and binary auxiliary functions. Then, we propose information-based acquisition functions such as mixed-type entropy search (MT-ES) and mixed-type predictive ES (MT-PES) for mixed-type BO based on the MOGP predictive belief of the target and auxiliary functions. The exact acquisition functions of MT-ES and MT-PES cannot be computed in closed form and need to be approximated. We derive an efficient approximation of MT-PES via a novel mixed-type random features approximation of the MOGP model whose cross-correlation structure between the target and auxiliary functions can be exploited for improving the belief of the global target maximizer using observations from evaluating these functions. We propose new practical constraints to relate the global target maximizer to the binary auxiliary functions. We empirically evaluate the performance of MT-ES and MT-PES with synthetic and real-world experiments.
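The random features approximation mentioned in the abstract builds on the standard construction for a single-output Gaussian process with a squared-exponential kernel. The sketch below illustrates only that standard construction, not the paper's mixed-type MOGP approximation. The names `rbf_kernel` and `make_rff_map`, the lengthscale, and the number of features are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, lengthscale=1.0):
    """Exact squared-exponential kernel matrix between rows of X and Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def make_rff_map(dim, num_features, lengthscale=1.0, seed=0):
    """Return a feature map phi such that phi(x) . phi(y) approximates k(x, y).

    Hypothetical helper: frequencies are drawn from the spectral density of
    the squared-exponential kernel (a Gaussian with std 1 / lengthscale),
    and phases are drawn uniformly from [0, 2*pi).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0 / lengthscale, size=(num_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)

    def phi(X):
        # Shape: (number of inputs, num_features).
        return np.sqrt(2.0 / num_features) * np.cos(X @ W.T + b)

    return phi


if __name__ == "__main__":
    X = np.random.default_rng(1).uniform(-1.0, 1.0, size=(20, 2))
    phi = make_rff_map(dim=2, num_features=5000, lengthscale=0.7, seed=0)
    features = phi(X)
    approx = features @ features.T
    exact = rbf_kernel(X, X, lengthscale=0.7)
    print("max abs kernel error:", np.max(np.abs(approx - exact)))
```

With a finite feature map of this kind, a GP sample becomes a linear model in the features, which is what makes sampling an approximate function and locating its maximizer tractable in predictive entropy search style methods.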

Kian Hsiang Low | Zhongxiang Dai | Yehong Zhang
