Adaptive first-order methods revisited: Convex optimization without Lipschitz requirements

We propose a new family of adaptive first-order methods for a class of convex minimization problems that may fail to be Lipschitz continuous or smooth in the standard sense. Specifically, motivated by a recent flurry of activity on non-Lipschitz (NoLips) optimization, we consider problems that are continuous or smooth relative to a reference Bregman function – as opposed to a global, ambient norm (Euclidean or otherwise). These conditions encompass a wide range of problems with singular objectives, such as Fisher markets, Poisson tomography, D-design, and the like. In this setting, existing order-optimal adaptive methods – such as UniXGrad or AcceleGrad – cannot be applied, especially in the presence of randomness and uncertainty. The proposed method, adaptive mirror descent (AdaMir), aims to close this gap by simultaneously achieving min-max optimal rates in problems that are relatively continuous or relatively smooth, including stochastic ones.
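For concreteness, relative smoothness replaces the usual Lipschitz-gradient bound with a bound expressed through the Bregman divergence of a reference function h, and mirror descent is the basic template that AdaMir adapts. The sketch below states the condition and the generic update; the step-size schedule shown is an illustrative AdaGrad-style rule with unspecified per-iteration terms, not necessarily the exact AdaMir schedule.

```latex
% Relative smoothness: f is L-smooth relative to h if, for all x, y,
%   f(x) <= f(y) + <grad f(y), x - y> + L * D_h(x, y),
% where D_h is the Bregman divergence induced by the reference function h:
\[
  D_h(x, y) \;=\; h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle .
\]
% Mirror-descent step with (sub)gradient g_t and step size gamma_t,
% here given an AdaGrad-style schedule (the delta_s are unspecified
% per-iteration feedback terms, e.g. squared gradient norms in the
% classical AdaGrad rule; they stand in for the paper's exact choice):
\[
  x_{t+1} \;=\; \operatorname*{arg\,min}_{x \in \mathcal{X}}
    \bigl\{ \gamma_t \langle g_t, x \rangle + D_h(x, x_t) \bigr\},
  \qquad
  \gamma_t \;=\; \frac{\gamma_0}{\sqrt{1 + \sum_{s \le t} \delta_s}} .
\]
```

When h is the squared Euclidean norm, D_h reduces to the squared distance and the update recovers adaptive projected gradient descent; choosing h adapted to the problem geometry (e.g. an entropic or log-barrier function) is what allows objectives with singular gradients to satisfy the condition.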
