Extending Expected Improvement for High-dimensional Stochastic Optimization of Expensive Black-Box Functions

Design optimization under uncertainty is notoriously difficult when the objective function is expensive to evaluate. State-of-the-art techniques, e.g, stochastic optimization or sampling average approximation, fail to learn exploitable patterns from collected data and require an excessive number of objective function evaluations. There is a need for techniques that alleviate the high cost of information acquisition and select sequential simulations optimally. In the field of deterministic single-objective unconstrained global optimization, the Bayesian global optimization (BGO) approach has been relatively successful in addressing the information acquisition problem. BGO builds a probabilistic surrogate of the expensive objective function and uses it to define an information acquisition function (IAF) whose role is to quantify the merit of making new objective evaluations. Specifically, BGO iterates between making the observations with the largest expected IAF and rebuilding the probabilistic surrogate, until a convergence criterion is met. In this work, we extend the expected improvement (EI) IAF to the case of design optimization under uncertainty wherein the EI policy is reformulated to filter out parametric and measurement uncertainties. To increase the robustness of our approach in the low sample regime, we employ a fully Bayesian interpretation of Gaussian processes by constructing a particle approximation of the posterior of its hyperparameters using adaptive Markov chain Monte Carlo. We verify and validate our approach by solving two synthetic optimization problems under uncertainty and demonstrate it by solving the oil-well-placement problem with uncertainties in the permeability field and the oil price time series.

[1]  Kyung K. Choi,et al.  Modified Bayesian Kriging for Noisy Response Problems for Reliability Analysis , 2015, DAC 2015.

[2]  Warren B. Powell,et al.  The Knowledge-Gradient Policy for Correlated Normal Beliefs , 2009, INFORMS J. Comput..

[3]  Jacob Dearmon,et al.  Gaussian Process Regression and Bayesian Model Averaging: An Alternative Approach to Modeling Spatial Phenomena , 2016 .

[4]  Ilias Bilionis,et al.  Multi-output separable Gaussian process: Towards an efficient, fully Bayesian paradigm for uncertainty quantification , 2013, J. Comput. Phys..

[5]  Warren B. Powell,et al.  The Correlated Knowledge Gradient for Simulation Optimization of Continuous Parameters using Gaussian Process Regression , 2011, SIAM J. Optim..

[6]  Warren B. Powell,et al.  A Knowledge-Gradient Policy for Sequential Information Collection , 2008, SIAM J. Control. Optim..

[7]  Warren B. Powell,et al.  The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery , 2011, INFORMS J. Comput..

[8]  Alexander Shapiro,et al.  The Sample Average Approximation Method for Stochastic Discrete Optimization , 2002, SIAM J. Optim..

[9]  Matthew W. Hoffman,et al.  Predictive Entropy Search for Efficient Global Optimization of Black-box Functions , 2014, NIPS.

[10]  R. A. Miller,et al.  Sequential kriging optimization using multiple-fidelity evaluations , 2006 .

[11]  E. Jaynes Probability theory : the logic of science , 2003 .

[12]  Philipp Hennig,et al.  Entropy Search for Information-Efficient Global Optimization , 2011, J. Mach. Learn. Res..

[13]  Julien Bect,et al.  Robust Gaussian Process-Based Global Optimization Using a Fully Bayesian Expected Improvement Criterion , 2011, LION.

[14]  Richard J. Beckman,et al.  A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output From a Computer Code , 2000, Technometrics.

[15]  Daniel W. Apley,et al.  Objective-Oriented Sequential Sampling for Simulation Based Robust Design Considering Multiple Sources of Uncertainty , 2013 .

[16]  Dimitri P. Bertsekas,et al.  Dynamic Programming and Optimal Control, Two Volume Set , 1995 .

[17]  Peng Chen,et al.  Uncertainty propagation using infinite mixture of Gaussian processes and variational Bayesian inference , 2015, J. Comput. Phys..

[18]  Donald R. Jones,et al.  Efficient Global Optimization of Expensive Black-Box Functions , 1998, J. Glob. Optim..

[19]  N. Zheng,et al.  Global Optimization of Stochastic Black-Box Systems via Sequential Kriging Meta-Models , 2006, J. Glob. Optim..

[20]  A. O'Hagan,et al.  Bayesian emulation of complex multi-output and dynamic computer models , 2010 .

[21]  Donald R. Jones,et al.  A Taxonomy of Global Optimization Methods Based on Response Surfaces , 2001, J. Glob. Optim..

[22]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[23]  Alexander J. Smola,et al.  Parallelized Stochastic Gradient Descent , 2010, NIPS.

[24]  Enrique del Castillo,et al.  A Bayesian Approach to Sequential Optimization based on Computer Experiments , 2015, Qual. Reliab. Eng. Int..

[25]  H. Jeffreys An invariant form for the prior probability in estimation problems , 1946, Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences.

[26]  Eric B. Baum,et al.  Neural net algorithms that learn in polynomial time from examples and queries , 1991, IEEE Trans. Neural Networks.

[27]  Michael James Sasena,et al.  Flexibility and efficiency enhancements for constrained global design optimization with kriging approximations. , 2002 .

[28]  T. Neumann Probability Theory The Logic Of Science , 2016 .

[29]  Adam D. Bull,et al.  Convergence Rates of Efficient Global Optimization Algorithms , 2011, J. Mach. Learn. Res..

[30]  N. Zabaras,et al.  Solution of inverse problems with limited forward solver evaluations: a Bayesian perspective , 2013 .

[31]  Christopher K. I. Williams,et al.  Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .

[32]  David J. C. MacKay,et al.  Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.

[33]  Michael Andrew Christie,et al.  Tenth SPE Comparative Solution Project: a comparison of upscaling techniques , 2001 .

[34]  R. Ghanem,et al.  Stochastic Finite Elements: A Spectral Approach , 1990 .

[35]  Aimo A. Törn,et al.  Global Optimization , 1999, Science.

[36]  Marco Locatelli,et al.  Bayesian Algorithms for One-Dimensional Global Optimization , 1997, J. Glob. Optim..

[37]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[38]  N. Cressie The origins of kriging , 1990 .

[39]  Ilias Bilionis,et al.  Multi-output local Gaussian process regression: Applications to uncertainty quantification , 2012, J. Comput. Phys..

[40]  Eric Walter,et al.  An informational approach to the global optimization of expensive-to-evaluate functions , 2006, J. Glob. Optim..

[41]  Jonas Mockus,et al.  Application of Bayesian approach to numerical methods of global and stochastic optimization , 1994, J. Glob. Optim..

[42]  Ilias Bilionis,et al.  Multidimensional Adaptive Relevance Vector Machines for Uncertainty Quantification , 2012, SIAM J. Sci. Comput..

[43]  Heikki Haario,et al.  DRAM: Efficient adaptive MCMC , 2006, Stat. Comput..