Paper Information - Last-Iterate Convergence of Saddle Point Optimizers via High-Resolution Differential Equations

Several widely used first-order saddle-point optimization methods, when derived naively, yield a continuous-time ordinary differential equation (ODE) identical to that of the Gradient Descent Ascent (GDA) method. However, the convergence properties of these methods are qualitatively different, even on simple bilinear games. Thus the ODE perspective, which has proved powerful in analyzing single-objective optimization methods, has not played a similar role in saddle-point optimization. We adopt a framework studied in fluid dynamics, known as High-Resolution Differential Equations (HRDEs), to design differential equation models for several saddle-point optimization methods. Critically, these HRDEs are distinct for various saddle-point optimization methods. Moreover, on bilinear games, the convergence properties of the HRDEs match the qualitative features of the corresponding discrete methods. Additionally, we show that the HRDE of Optimistic Gradient Descent Ascent (OGDA) exhibits last-iterate convergence for general monotone variational inequalities. Finally, we provide rates of convergence for the best-iterate convergence of the OGDA method, relying solely on the first-order smoothness of the monotone operator.
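The qualitative gap between GDA and OGDA on a bilinear game, which the abstract refers to, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the objective f(x, y) = x * y (minimized over x, maximized over y), the step size 0.1, the starting point (1, 1), and the helper names `field`, `run_gda`, and `run_ogda`. The OGDA update is written in one common form, z_{k+1} = z_k - 2*step*F(z_k) + step*F(z_{k-1}); the paper's exact parametrization may differ.

```python
import math


def field(x, y):
    """Vector field F(x, y) = (df/dx, -df/dy) for the assumed game f(x, y) = x * y."""
    return y, -x


def run_gda(x, y, step, iters):
    """Simultaneous gradient descent ascent: z <- z - step * F(z)."""
    for _ in range(iters):
        gx, gy = field(x, y)
        x, y = x - step * gx, y - step * gy
    # Distance to the unique saddle point (0, 0).
    return math.hypot(x, y)


def run_ogda(x, y, step, iters):
    """Optimistic GDA: z <- z - 2 * step * F(z) + step * F(z_prev)."""
    # Convention assumed here: the "previous" field value starts as F(z_0).
    prev_gx, prev_gy = field(x, y)
    for _ in range(iters):
        gx, gy = field(x, y)
        x = x - 2 * step * gx + step * prev_gx
        y = y - 2 * step * gy + step * prev_gy
        prev_gx, prev_gy = gx, gy
    return math.hypot(x, y)


if __name__ == "__main__":
    start = math.hypot(1.0, 1.0)
    print("initial distance to saddle:", start)
    print("GDA  after 2000 steps:", run_gda(1.0, 1.0, 0.1, 2000))
    print("OGDA after 2000 steps:", run_ogda(1.0, 1.0, 0.1, 2000))
```

Under these assumptions, each GDA step multiplies the distance to the saddle point by sqrt(1 + step^2), so the iterates spiral outward, while the OGDA iterates contract toward (0, 0). Both methods share the same naive continuous-time limit, which is the mismatch the HRDE framework is meant to resolve.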

Michael I. Jordan | Tatjana Chavdarova | M. Zampetakis

[1] Haihao Lu,et al. An O(s^r)-resolution ODE framework for understand , 2020, Mathematical Programming.

[2] Michael I. Jordan,et al. Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization , 2020, AISTATS.

[3] Laurent Lessard,et al. A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints , 2020, J. Mach. Learn. Res.

[4] Ya-Ping Hsieh,et al. The limits of min-max optimization algorithms: convergence to spurious non-critical sets , 2020, ICML.

[5] Jacob Abernethy,et al. Last-iterate convergence rates for min-max optimization , 2019, ArXiv.

[6] Michael I. Jordan,et al. Understanding the acceleration phenomenon via high-resolution differential equations , 2018, Mathematical Programming.

[7] Lillian J. Ratliff,et al. Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation , 2021, ICLR.

[8] Noah Golowich,et al. Tight last-iterate convergence rates for no-regret learning in multi-player games , 2020, NeurIPS.

[9] Ioannis Mitliagkas,et al. LEAD: Least-Action Dynamics for Min-Max Optimization , 2020, ArXiv.

[10] Ioannis Mitliagkas,et al. Stochastic Hamiltonian Gradient Methods for Smooth Games , 2020, ICML.

[11] Michael I. Jordan,et al. On dissipative symplectic integration with applications to gradient-based optimization , 2020, Journal of Statistical Mechanics: Theory and Experiment.