A dimensionality reduction technique for unconstrained global optimization of functions with low effective dimensionality

We investigate the unconstrained global optimization of functions with low effective dimensionality, namely, functions that are constant along certain (unknown) linear subspaces. Extending the technique of random subspace embeddings in [Wang et al., Bayesian optimization in a billion dimensions via random embeddings, JAIR, 55(1): 361--387, 2016], we study a generic Random Embeddings for Global Optimization (REGO) framework that is compatible with any global minimization algorithm. Within REGO, instead of the original, potentially large-scale problem, a low-dimensional problem with bound constraints is formulated via a Gaussian random embedding and solved in the reduced space. We provide novel probabilistic bounds for the success of REGO in solving the original, low effective-dimensionality problem; these bounds are independent of the (potentially large) ambient dimension and depend precisely on the dimensions of the effective subspace and of the random embedding subspace. Our results significantly improve existing theoretical analyses by providing the exact distribution of a reduced minimizer and of its Euclidean norm, and by requiring only general assumptions on the problem. We validate our theoretical findings through extensive numerical testing of REGO with three types of global optimization solvers, illustrating its improved scalability compared to the full-dimensional application of the respective solvers.
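To make the reduced problem concrete, here is a minimal sketch of the REGO construction under the assumptions above: draw a D x d Gaussian matrix A, globally minimize y -> f(Ay) over a low-dimensional box, and map the reduced minimizer back to the ambient space. The function name `rego`, the box half-width `delta`, the toy objective, and the use of SciPy's differential evolution as a stand-in for "any global minimization algorithm" are all illustrative choices, not the paper's own code.

```python
# Minimal sketch of the REGO framework; names and parameter choices
# (e.g. the box half-width `delta`) are illustrative assumptions.
import numpy as np
from scipy.optimize import differential_evolution

def rego(f, D, d, delta=5.0, seed=0):
    """Minimize f : R^D -> R, assumed to have effective dimension at most d,
    via a random, low-dimensional, bound-constrained reduced problem."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((D, d))   # Gaussian random embedding matrix
    g = lambda y: f(A @ y)            # reduced objective on R^d
    # Any global solver can be plugged in here; differential evolution
    # is just one concrete choice that handles bound constraints.
    res = differential_evolution(g, bounds=[(-delta, delta)] * d, seed=seed)
    return A @ res.x, res.fun         # minimizer in R^D and its value

# Toy check: f has ambient dimension D = 1000 but effective dimension 2.
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
x_star, f_star = rego(f, D=1000, d=2)
print(f_star)  # with high probability, close to the global minimum 0
```

The choice of `delta` matters: the paper's probabilistic bounds on the Euclidean norm of a reduced minimizer quantify how likely that minimizer is to fall inside the chosen box, and hence how large the box and the embedding dimension d need to be for REGO to succeed.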

[1] Kirthevasan Kandasamy et al. High Dimensional Bayesian Optimisation and Bandits via Additive Models. ICML, 2015.

[2] C. D. Perttunen et al. Lipschitzian optimization without the Lipschitz constant. 1993.

[3] Matthias Poloczek et al. A Framework for Bayesian Optimization in Embedded Subspaces. ICML, 2019.

[4] A. Zygmund et al. Measure and Integral: An Introduction to Real Analysis. 1977.

[5] Jan Vybíral et al. Learning Functions of Few Arbitrary Linear Parameters in High Dimensions. Found. Comput. Math., 2010.

[6] Andreas Krause et al. Joint Optimization and Variable Selection of High-dimensional Gaussian Processes. ICML, 2012.

[7] Volkan Cevher et al. High-Dimensional Bayesian Optimization via Additive Models with Overlapping Groups. AISTATS, 2018.

[8] Zi Wang et al. Batched Large-scale Bayesian Optimization in High-dimensional Spaces. AISTATS, 2017.

[9] V. Cevher et al. Learning Non-Parametric Basis Independent Models from Point Queries via Low-Rank Methods. arXiv:1310.1826, 2013.

[10] Arjun K. Gupta et al. Lp-norm spherical distribution. 1997.

[11] Tamás Vinkó et al. A comparison of complete global optimization solvers. Math. Program., 2005.

[12] Yang Yu et al. Derivative-Free Optimization of High-Dimensional Non-Convex Functions by Sequential Random Embeddings. IJCAI, 2016.

[13] M. Rudelson et al. The smallest singular value of a random rectangular matrix. arXiv:0802.3956, 2008.

[14] Kevin Leyton-Brown et al. An Efficient Approach for Assessing Hyperparameter Importance. ICML, 2014.

[15] C. T. Kelley et al. A Locally-Biased Form of the DIRECT Algorithm. J. Glob. Optim., 2001.

[16] S. Szarek et al. Local Operator Theory, Random Matrices and Banach Spaces (Chapter 8). 2001.

[17] Yang Yu et al. Solving High-Dimensional Multi-Objective Optimization Problems with Low Effective Dimensions. AAAI, 2017.

[18] Edward Neuman. Inequalities and Bounds for the Incomplete Gamma Function. 2013.

[19] David Ginsbourger et al. On the choice of the low-dimensional domain for global optimization via random embeddings. J. Glob. Optim., 2017.

[20] Chris G. Knight et al. Association of parameter, software, and hardware variation with large-scale behavior across 57,000 climate models. Proc. Natl. Acad. Sci., 2007.

[21] Roman Vershynin. High-Dimensional Probability. 2018.

[22] Andrew Gordon Wilson et al. Scaling Gaussian Process Regression with Derivatives. NeurIPS, 2018.

[23] Chun-Liang Li et al. High Dimensional Bayesian Optimization via Restricted Projection Pursuit Models. AISTATS, 2016.

[24] Ernesto P. Adorio et al. MVF: Multivariate Test Functions Library in C for Unconstrained Global Optimization. 2005.

[25] Andreas Krause et al. High-Dimensional Gaussian Process Bandits. NIPS, 2013.

[26] D. Finkel. DIRECT Optimization Algorithm User Guide. 2003.

[27] A. Rukhin. Matrix Variate Distributions. 1999.

[28] L. C. W. Dixon and G. P. Szegő (eds.). Towards Global Optimization 2. North-Holland, Amsterdam, 1978.

[29] A. Edelman. Eigenvalues and condition numbers of random matrices. 1988.

[30] Yoshua Bengio et al. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res., 2012.

[31] Roman Garnett et al. Active Learning of Linear Embeddings for Gaussian Processes. UAI, 2013.

[32] Jorge Nocedal et al. Knitro: An Integrated Package for Nonlinear Optimization. 2006.

[33] Nikolaos V. Sahinidis et al. A polyhedral branch-and-cut approach to global optimization. Math. Program., 2005.

[34] Nando de Freitas et al. Bayesian Optimization in a Billion Dimensions via Random Embeddings. J. Artif. Intell. Res., 2016.

[35] David Ginsbourger et al. A Warped Kernel Improving Robustness in Bayesian Optimization via Random Embeddings. LION, 2014.

[36] L. Joseph et al. Bayesian Statistics: An Introduction. 1989.

[37] Malek Ben Salem et al. Sequential dimension reduction for learning features of expensive black-box functions. 2019.

[38] Ata Kabán et al. REMEDA: Random Embedding EDA for Optimising Functions with Intrinsic Dimension. PPSN, 2016.

[39] S. Kotz et al. Symmetric Multivariate and Related Distributions. 1989.