ADMMBO: Bayesian Optimization with Unknown Constraints using ADMM

There exist many problems in science and engineering that involve optimization of an unknown or partially unknown objective function. Recently, Bayesian Optimization (BO) has emerged as a powerful tool for solving optimization problems whose objective functions are only available as a black box and are expensive to evaluate. Many practical problems, however, involve optimization of an unknown objective function subject to unknown constraints. This is an important yet challenging problem for which, unlike optimizing an unknown function, existing methods face several limitations. In this paper, we present a novel constrained Bayesian optimization framework to optimize an unknown objective function subject to unknown constraints. We introduce an equivalent optimization by augmenting the objective function with constraints, introducing auxiliary variables for each constraint, and forcing the new variables to be equal to the main variable. Building on the Alternating Direction Method of Multipliers (ADMM) algorithm, we propose ADMM-Bayesian Optimization (ADMMBO) to solve the problem in an iterative fashion. Our framework leads to multiple unconstrained subproblems with unknown objective functions, which we then solve via BO. Our method resolves several challenges of state-of-the-art techniques: it can start from infeasible points, is insensitive to initialization, can efficiently handle 'decoupled problems' and has a concrete stopping criterion. Extensive experiments on a number of challenging BO benchmark problems show that our proposed approach outperforms the state-of-the-art methods in terms of the speed of obtaining a feasible solution and convergence to the global optimum as well as minimizing the number of total evaluations of unknown objective and constraints functions.

[1]  Matthew W. Hoffman,et al.  A General Framework for Constrained Bayesian Optimization using Information-based Search , 2015, J. Mach. Learn. Res..

[2]  Victor Picheny,et al.  Bayesian optimization under mixed constraints with a slack-variable augmented Lagrangian , 2016, NIPS.

[3]  Matthew W. Hoffman,et al.  Predictive Entropy Search for Bayesian Optimization with Unknown Constraints , 2015, ICML.

[4]  Nando de Freitas,et al.  A Bayesian interactive optimization approach to procedural animation design , 2010, SCA '10.

[5]  Zoubin Ghahramani,et al.  Collaborative Gaussian Processes for Preference Learning , 2012, NIPS.

[6]  Jasper Snoek,et al.  Bayesian Optimization with Unknown Constraints , 2014, UAI.

[7]  Ali Jalali,et al.  Hybrid Batch Bayesian Optimization , 2012, ICML.

[8]  Yoshua Bengio,et al.  Algorithms for Hyper-Parameter Optimization , 2011, NIPS.

[9]  Harold J. Kushner,et al.  A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise , 1964 .

[10]  Nando de Freitas,et al.  Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.

[11]  Wotao Yin,et al.  Global Convergence of ADMM in Nonconvex Nonsmooth Optimization , 2015, Journal of Scientific Computing.

[12]  Stephen J. Wright,et al.  Numerical Optimization , 2018, Fundamental Statistical Inference.

[13]  Jasper Snoek,et al.  Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[14]  P Baldi,et al.  Enhanced Higgs boson to τ(+)τ(-) search with deep learning. , 2014, Physical review letters.

[15]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[16]  Nando de Freitas,et al.  Active Policy Learning for Robot Planning and Exploration under Uncertainty , 2007, Robotics: Science and Systems.

[17]  Matthew W. Hoffman,et al.  Predictive Entropy Search for Efficient Global Optimization of Black-box Functions , 2014, NIPS.

[18]  Robert B. Gramacy,et al.  Optimization Under Unknown Constraints , 2010, 1004.4027.

[19]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[20]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[21]  Donald R. Jones,et al.  A Taxonomy of Global Optimization Methods Based on Response Surfaces , 2001, J. Glob. Optim..

[22]  Victor Picheny,et al.  Quantile-Based Optimization of Noisy Computer Experiments With Tunable Precision , 2013, Technometrics.

[23]  Donald R. Jones,et al.  Efficient Global Optimization of Expensive Black-Box Functions , 1998, J. Glob. Optim..

[24]  Jonas Mockus,et al.  On Bayesian Methods for Seeking the Extremum , 1974, Optimization Techniques.

[25]  Aaron Klein,et al.  Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets , 2016, AISTATS.

[26]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[27]  Zhi-Quan Luo,et al.  Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems , 2015, ICASSP.

[28]  Zhi-Quan Luo,et al.  On the linear convergence of the alternating direction method of multipliers , 2012, Mathematical Programming.

[29]  Donald R. Jones,et al.  Global versus local search in constrained optimization of computer models , 1998 .

[30]  Jasper Snoek,et al.  Bayesian Optimization and Semiparametric Models with Applications to Assistive Technology , 2014 .

[31]  Steven L. Scott,et al.  A modern Bayesian look at the multi-armed bandit , 2010 .

[32]  Sébastien Le Digabel,et al.  Modeling an Augmented Lagrangian for Blackbox Constrained Optimization , 2014, Technometrics.

[33]  Victor Picheny,et al.  A Stepwise uncertainty reduction approach to constrained global optimization , 2014, AISTATS.

[34]  D. Dennis,et al.  A statistical method for global optimization , 1992, [Proceedings] 1992 IEEE International Conference on Systems, Man, and Cybernetics.

[35]  Nando de Freitas,et al.  A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning , 2010, ArXiv.

[36]  João M. F. Xavier,et al.  D-ADMM: A Communication-Efficient Distributed Algorithm for Separable Optimization , 2012, IEEE Transactions on Signal Processing.

[37]  Stephen P. Boyd,et al.  Proximal Algorithms , 2013, Found. Trends Optim..

[38]  Michael A. Gelbart,et al.  Constrained Bayesian Optimization and Applications , 2015 .

[39]  Jasper Snoek,et al.  Multi-Task Bayesian Optimization , 2013, NIPS.

[40]  Kevin Leyton-Brown,et al.  Sequential Model-Based Optimization for General Algorithm Configuration , 2011, LION.

[41]  D. Finkel,et al.  Direct optimization algorithm user guide , 2003 .

[42]  D. Lizotte Practical bayesian optimization , 2008 .

[43]  Nando de Freitas,et al.  On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning , 2014, AISTATS.

[44]  Matt J. Kusner,et al.  Bayesian Optimization with Inequality Constraints , 2014, ICML.

[45]  Zheng Xu,et al.  An Empirical Study of ADMM for Nonconvex Problems , 2016, ArXiv.

[46]  Tom Minka,et al.  A family of algorithms for approximate Bayesian inference , 2001 .

[47]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[48]  Matthias Poloczek,et al.  Bayesian Optimization with Gradients , 2017, NIPS.

[49]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .