Multi-Information Source Optimization

We consider Bayesian optimization of an expensive-to-evaluate black-box objective function when cheaper approximations of the objective are also available. Such approximations arise in applications such as reinforcement learning, engineering, and the natural sciences, and they are subject to an inherent, unknown bias. This model discrepancy stems from an inadequate internal model that deviates from reality, and it can vary over the domain, which makes exploiting these approximations a non-trivial task. We present a novel algorithm that provides a rigorous mathematical treatment of the uncertainties arising from model discrepancies and noisy observations. Its optimization decisions rely on a value-of-information analysis that extends the Knowledge Gradient factor to the setting of multiple information sources that vary in cost: each sampling decision maximizes the predicted benefit per unit cost. An experimental evaluation demonstrates that the method consistently outperforms other state-of-the-art techniques: it finds designs of considerably higher objective value and also incurs less cost during exploration.
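
The "benefit per unit cost" rule can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on simplifying assumptions that do not come from the abstract: a finite set of candidate designs, a Gaussian posterior over the objective with mean `mu` and covariance `cov`, and, for each information source, a pointwise bias variance, a noise variance, and a fixed query cost. All function names and the example numbers are hypothetical.

```python
import math

def expected_max(mu, sigma_tilde, n_grid=2001, z_max=8.0):
    """Approximate E[max_i(mu_i + sigma_tilde_i * Z)] for Z ~ N(0, 1)
    with trapezoidal quadrature on [-z_max, z_max]."""
    h = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        z = -z_max + k * h
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        best = max(m + s * z for m, s in zip(mu, sigma_tilde))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * best * pdf * h
    return total

def kg_per_cost(mu, cov, x, bias_var, noise_var, cost):
    """Expected one-step gain in the best posterior mean from querying
    design x at a source, divided by that source's query cost.
    Assumption: the source observes objective(x) + bias + noise, with
    bias and noise independent of the objective."""
    obs_var = cov[x][x] + bias_var + noise_var
    # Standard deviation of the change in each posterior mean.
    sigma_tilde = [cov[i][x] / math.sqrt(obs_var) for i in range(len(mu))]
    gain = expected_max(mu, sigma_tilde) - max(mu)
    return gain / cost

def choose_sample(mu, cov, sources):
    """Return (score, source_name, x) maximizing gain per unit cost."""
    best = None
    for name, (bias_var, noise_var, cost) in sources.items():
        for x in range(len(mu)):
            score = kg_per_cost(mu, cov, x, bias_var, noise_var, cost)
            if best is None or score > best[0]:
                best = (score, name, x)
    return best

# Hypothetical posterior over three candidate designs.
mu = [0.0, 0.1, -0.2]
cov = [[1.0, 0.5, 0.1],
       [0.5, 1.0, 0.5],
       [0.1, 0.5, 1.0]]
# source name -> (bias variance, noise variance, cost per query)
sources = {"expensive": (0.0, 0.01, 10.0), "cheap": (0.5, 0.01, 1.0)}

score, source, x = choose_sample(mu, cov, sources)
print(source, x, round(score, 4))
```

In this toy setting the biased source is less informative per query, but its much lower cost can still make it the better choice. That trade-off is what a cost-normalized acquisition value is meant to capture.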
