Fast Calculation of the Knowledge Gradient for Optimization of Deterministic Engineering Simulations

A novel, efficient method for computing the Knowledge-Gradient policy for Continuous Parameters (KGCP) in deterministic optimization is derived. Its differences from Expected Improvement (EI), a popular choice for Bayesian optimization of deterministic engineering simulations, are explored. Both policies, along with the Upper Confidence Bound (UCB) policy, are compared on a number of benchmark functions, including a problem from structural dynamics. It is shown empirically that KGCP performs similarly to the EI policy on many problems but has better convergence properties on complex (multi-modal) optimization problems, as it places more emphasis on exploration when the model is confident about the shape of the optimal regions. In addition, the relationship between the method used to estimate the hyperparameters of the underlying models (Maximum Likelihood Estimation (MLE) or slice sampling) and the complexity of the problem at hand is studied.
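For orientation, the two baseline policies named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a minimization problem, a Gaussian posterior with mean `mu` and standard deviation `sigma` at a candidate point, and a best observed value `f_best`. The function names and the default `beta` are hypothetical choices made for this example. KGCP itself is not shown, since its computation is the subject of the paper.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimization under a Gaussian posterior.

    mu, sigma: posterior mean and standard deviation at the candidate point
    f_best: best (lowest) value observed so far (assumed noise-free)
    """
    if sigma <= 0.0:
        # No predictive uncertainty, e.g. at an already evaluated point.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    return (f_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def lower_confidence_bound(mu, sigma, beta=2.0):
    """Confidence-bound score for minimization (the mirror image of UCB).

    Smaller is more promising; beta (an assumed default) trades off
    exploitation (low mu) against exploration (high sigma).
    """
    return mu - beta * sigma
```

In a Bayesian optimization loop, the next simulation would typically be run at the candidate that maximizes `expected_improvement` or minimizes `lower_confidence_bound`. With `mu` and `f_best` fixed, EI grows with `sigma`, which is how that policy rewards exploration.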
