Bayesian Optimization Based on K-Optimality

Bayesian optimization (BO) based on the Gaussian process (GP) surrogate model has attracted extensive attention in optimization and design of experiments (DoE). It typically faces two problems: unstable GP predictions caused by an ill-conditioned Gram matrix of the kernel, and the difficulty of determining the trade-off parameter between exploitation and exploration. To address these problems, we investigate K-optimality, which aims to minimize the condition number. First, a sequential Bayesian K-optimal design (SBKO) is proposed to ensure the stability of the GP prediction, with the K-optimality criterion serving as the acquisition function. We show that the SBKO simultaneously reduces the integrated posterior variance and maximizes the information gain on the hyper-parameters. Second, a K-optimality-enhanced Bayesian optimization (KO-BO) approach is given for optimization problems, where K-optimality is used to set the exploitation-exploration trade-off parameter, which is then determined automatically. Specifically, we focus on the K-optimality-enhanced expected improvement algorithm (KO-EI). Numerical examples show that the SBKO generally outperforms Monte Carlo, Latin hypercube sampling, and sequential DoE approaches, yielding the smallest posterior variance and the highest prediction accuracy. Furthermore, the optimization study shows that the KO-EI method outperforms the classical EI method, with a faster convergence rate and smaller variance.
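As a minimal illustration of the K-optimality idea, and not the paper's exact algorithm, the sketch below scores a candidate point by the condition number of the kernel Gram matrix augmented with that point, and picks the candidate with the smallest condition number as the next design point. The squared-exponential kernel, length-scale, nugget, and candidate-pool search are placeholder assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential kernel matrix between two point sets."""
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def k_optimal_score(x_new, X_design, lengthscale=1.0, nugget=1e-10):
    """Condition number of the Gram matrix after adding x_new.

    Smaller is better: the candidate that keeps the augmented Gram
    matrix best conditioned is preferred (hypothetical SBKO-style
    criterion; the paper's exact formulation may differ).
    """
    X_aug = np.vstack([X_design, x_new[None, :]])
    K = rbf_kernel(X_aug, X_aug, lengthscale) + nugget * np.eye(len(X_aug))
    return np.linalg.cond(K)

# Toy usage: choose the next point from a random candidate pool.
rng = np.random.default_rng(0)
X_design = rng.uniform(0, 1, size=(5, 2))      # current design points
candidates = rng.uniform(0, 1, size=(200, 2))  # candidate pool
scores = [k_optimal_score(x, X_design) for x in candidates]
x_next = candidates[int(np.argmin(scores))]
```

In a full BO loop this condition-number score would be combined with (or used to parameterize) an acquisition function such as expected improvement, which is the role K-optimality plays in the KO-EI variant described above.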
