Distributed Batch Gaussian Process Optimization

This paper presents a novel distributed batch Gaussian process upper confidence bound (DB-GP-UCB) algorithm for batch Bayesian optimization (BO) of highly complex, costly-to-evaluate black-box objective functions. In contrast to existing batch BO algorithms, DB-GP-UCB jointly optimizes a batch of inputs (as opposed to selecting the inputs of a batch one at a time) while still preserving scalability in the batch size. To realize this, we generalize GP-UCB to a new batch variant amenable to a Markov approximation, which can then be naturally formulated as a multi-agent distributed constraint optimization problem (DCOP). This formulation lets us exploit the efficiency of state-of-the-art DCOP solvers to achieve running time linear in the batch size. DB-GP-UCB offers practitioners the flexibility to trade off approximation quality against time efficiency by varying the Markov order. We provide a theoretical guarantee on the convergence rate of DB-GP-UCB via bounds on its cumulative regret. Empirical evaluation on synthetic benchmark objective functions and a real-world optimization problem shows that DB-GP-UCB outperforms state-of-the-art batch BO algorithms.
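For orientation, the following is a minimal sketch of the standard sequential GP-UCB rule that the batch variant generalizes: pick the input maximizing the posterior mean plus a scaled posterior standard deviation. It is not the DB-GP-UCB algorithm itself; it has no batch selection, Markov approximation, or DCOP solver. The squared-exponential kernel, its hyperparameters, the fixed `beta`, the one-dimensional candidate grid, and all function names are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def gp_posterior(x_obs, y_obs, x_cand, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_cand."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_cand)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    prior_var = rbf_kernel(x_cand, x_cand).diagonal()
    var = prior_var - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    # Clip tiny negative values caused by floating-point round-off.
    return mean, np.sqrt(np.clip(var, 0.0, None))


def gp_ucb_select(x_obs, y_obs, x_cand, beta=4.0):
    """Return the candidate maximizing mean + sqrt(beta) * std (sequential GP-UCB)."""
    mean, std = gp_posterior(x_obs, y_obs, x_cand)
    return x_cand[np.argmax(mean + np.sqrt(beta) * std)]


if __name__ == "__main__":
    # Hypothetical toy objective standing in for a costly black-box function.
    f = lambda x: -(x - 0.3) ** 2
    x_cand = np.linspace(0.0, 1.0, 101)
    x_obs = np.array([0.0, 1.0])
    y_obs = f(x_obs)
    for _ in range(5):
        x_next = gp_ucb_select(x_obs, y_obs, x_cand)
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))
    print(x_obs)
```

In this sequential form, each query waits for the previous observation. A batch method instead has to choose several inputs before any of their outputs are seen, which is the setting the abstract addresses.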
