A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption

We study the problem of optimizing a function under a \emph{budgeted number of evaluations}. We only assume that the function is \emph{locally} smooth around one of its global optima. The difficulty of optimization is measured in terms of 1) the amount of \emph{noise} $b$ in the function evaluations and 2) the local smoothness, $d$, of the function. A smaller $d$ results in a smaller optimization error. We propose a new, simple, and parameter-free approach. First, for all values of $b$ and $d$, this approach attains regret guarantees at least as good as the state of the art. Second, it obtains these results while being \textit{agnostic} to the values of both $b$ and $d$. This leads to the first algorithm that naturally adapts to an \textit{unknown} range of noise $b$, and it yields significant improvements in the moderate- and low-noise regimes. Third, our approach also obtains a remarkable improvement over the state-of-the-art SOO algorithm when the noise is very low, which includes the case of optimization under deterministic feedback ($b=0$). There, under our minimal local smoothness assumption, this improvement is of exponential magnitude and holds for a class of functions ($d=0$) that covers the vast majority of functions that practitioners optimize. Experiments bear out this algorithmic improvement: we observe empirically faster convergence on common benchmarks.
