Scalable Combinatorial Bayesian Optimization with Tractable Statistical models

We study the problem of optimizing expensive black-box functions over combinatorial spaces (e.g., sets, sequences, trees, and graphs). BOCS (Baptista and Poloczek, 2018) is a state-of-the-art Bayesian optimization method for tractable statistical models; it selects the next structure for evaluation by performing acquisition function optimization (AFO) based on semi-definite programming. Unfortunately, BOCS scales poorly to a large number of binary and/or categorical variables. Building on recent advances in submodular relaxation (Ito and Fujimaki, 2016) for solving Binary Quadratic Programs, we study an approach referred to as Parametrized Submodular Relaxation (PSR), with the goal of improving the scalability and accuracy of solving AFO problems for the BOCS model. The PSR approach relies on two key ideas. First, it reformulates the AFO problem as a submodular relaxation with some unknown parameters, which can be solved efficiently using minimum graph cut algorithms. Second, it constructs an optimization problem to estimate the unknown parameters so that the relaxation closely approximates the true objective. Experiments on diverse benchmark problems show significant improvements with PSR for the BOCS model. The source code is available at this https URL.
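The first idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the binary quadratic objective x^T A x + b^T x and one standard linear lower bound for each positive off-diagonal product, x_i x_j >= gamma_ij (x_i + x_j - 1) with gamma_ij in [0, 1]. The helper names (`quad`, `relax`, `brute_min`) and the fixed choice gamma = 0.5 are hypothetical, and exhaustive search over a tiny problem stands in for the minimum graph cut solver mentioned in the abstract.

```python
import itertools
import numpy as np

def quad(A, b, x):
    """Binary quadratic objective x^T A x + b^T x."""
    return float(x @ A @ x + b @ x)

def relax(A, b, gamma):
    """Build a submodular lower bound of x^T A x + b^T x (illustrative only).

    Positive off-diagonal entries of A are the non-submodular terms. Each
    product x_i x_j is replaced by the linear bound gamma_ij * (x_i + x_j - 1).
    Returns (A_sub, b_rel, c_rel) for x^T A_sub x + b_rel^T x + c_rel.
    """
    A_pos = np.where(A > 0, A, 0.0)
    np.fill_diagonal(A_pos, 0.0)   # diagonal terms are linear for binary x
    A_sub = A - A_pos              # off-diagonal entries are now <= 0
    W = A_pos * gamma
    # sum_ij W_ij (x_i + x_j - 1) = (row sums + column sums) . x - sum(W)
    b_rel = b + W.sum(axis=1) + W.sum(axis=0)
    c_rel = -float(W.sum())
    return A_sub, b_rel, c_rel

def brute_min(f, n):
    """Exhaustive minimizer over {0,1}^n; a stand-in for a min-cut solver."""
    best = min(itertools.product([0, 1], repeat=n),
               key=lambda bits: f(np.array(bits, dtype=float)))
    return np.array(best, dtype=float)

# Small random instance (assumed data, for illustration only).
rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n))
A = (A + A.T) / 2
b = rng.normal(size=n)
gamma = np.full((n, n), 0.5)       # fixed here; PSR would estimate these parameters

A_sub, b_rel, c_rel = relax(A, b, gamma)
x_relaxed = brute_min(lambda x: quad(A_sub, b_rel, x) + c_rel, n)
x_true = brute_min(lambda x: quad(A, b, x), n)
print("objective at relaxed minimizer:", quad(A, b, x_relaxed))
print("true minimum:", quad(A, b, x_true))
```

The gap between the two printed values depends on gamma, which is why the second idea, estimating the relaxation parameters so that the bound stays close to the true objective, matters in practice.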

[1]  Alán Aspuru-Guzik,et al.  Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules , 2016, ACS central science.

[2]  Peter I. Frazier,et al.  Bayesian optimization for materials design , 2015, arXiv:1506.01349.

[3]  Yisong Yue,et al.  A General Framework for Multi-fidelity Bayesian Optimization with Gaussian Processes , 2018, AISTATS.

[4]  Vladimir Kolmogorov,et al.  What energy functions can be minimized via graph cuts? , 2002, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Janardhan Rao Doppa,et al.  Uncertainty-Aware Search Framework for Multi-Objective Bayesian Optimization , 2020, AAAI.

[6]  Janardhan Rao Doppa,et al.  Max-value Entropy Search for Multi-Objective Bayesian Optimization , 2020, NeurIPS.

[7]  Kevin Leyton-Brown,et al.  Sequential Model-Based Optimization for General Algorithm Configuration , 2011, LION.

[8]  Anton van den Hengel,et al.  Semidefinite Programming , 2014, Computer Vision, A Reference Guide.

[9]  Vladimir Kolmogorov,et al.  An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision , 2001, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Peter I. Frazier,et al.  A Tutorial on Bayesian Optimization , 2018, ArXiv.

[11]  Zi Wang,et al.  Max-value Entropy Search for Efficient Bayesian Optimization , 2017, ICML.

[12]  Alan Fern,et al.  Optimizing Discrete Spaces via Expensive Evaluations: A Learning to Search Framework , 2020, AAAI.

[13]  Daniel Freedman,et al.  Energy minimization via graph cuts: settling what is possible , 2005, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Moses Charikar,et al.  Maximizing quadratic programs: extending Grothendieck's inequality , 2004, 45th Annual IEEE Symposium on Foundations of Computer Science.

[15]  Shinji Ito,et al.  Large-Scale Price Optimization via Network Flow , 2016, NIPS.

[16]  Stephan Mertens,et al.  Low autocorrelation binary sequences , 2015, arXiv:1512.02475.

[17]  Jasper Snoek,et al.  Practical Bayesian Optimization of Machine Learning Algorithms , 2012, NIPS.

[18]  Daniel Hernández-Lobato,et al.  Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints , 2016, Neurocomputing.

[19]  Jian-Qiang Hu,et al.  Contamination control in food supply chain , 2010, Proceedings of the 2010 Winter Simulation Conference.

[20]  Nando de Freitas,et al.  Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.

[21]  Matthias Poloczek,et al.  Bayesian Optimization of Combinatorial Structures , 2018, ICML.

[22]  Huan Li,et al.  Accelerated Proximal Gradient Methods for Nonconvex Programming , 2015, NIPS.

[23]  Ismail Ben Ayed,et al.  Pseudo-bound Optimization for Binary Energies , 2014, ECCV.

[24]  James G. Scott,et al.  The horseshoe estimator for sparse signals , 2010.

[25]  Benjamin Van Roy,et al.  A Tutorial on Thompson Sampling , 2017, Found. Trends Mach. Learn.

[26]  Lena Gorelick,et al.  Submodularization for Binary Pairwise Energies , 2014, IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Jakub M. Tomczak,et al.  Combinatorial Bayesian Optimization using the Graph Cartesian Product , 2019, NeurIPS.