Practical bayesian optimization

Global optimization of non-convex functions over real vector spaces is a problem of widespread theoretical and practical interest. In the past fifty years, research in global optimization has produced many important approaches including Lipschitz optimization, simulated annealing, homotopy methods, genetic algorithms, and Bayesian response-surface methods. This work examines the last of these approaches. The Bayesian response-surface approach to global optimization maintains a posterior model of the function being optimized by combining a prior over functions with accumulating function evaluations. The model is then used to compute which point the method should acquire next in its search for the optimum of the function. Bayesian methods can be some of the most efficient approaches to optimization in terms of the number of function evaluations required, but they have significant drawbacks: Current approaches are needlessly data-inefficient, approximations to the Bayes-optimal acquisition criterion are poorly studied, and current approaches do not take advantage of the small-scale properties of differentiable functions near local optima. This work addresses each of these problems to make Bayesian methods more widely applicable.