On the Impossibility of Global Convergence in Multi-Loss Optimization

Under mild regularity conditions, gradient-based methods converge globally to a critical point in the single-loss setting. This is known to break down for vanilla gradient descent in multi-loss optimization, but can we hope to design an algorithm with global guarantees? We resolve this open problem in the negative by proving that any reasonable algorithm exhibits limit cycles or diverges to infinite losses in some differentiable game, even in two-player games with zero-sum interactions. A reasonable algorithm is simply one that avoids strict maxima, an exceedingly weak assumption since converging to a maximum would be the opposite of minimization. The impossibility theorem holds even if we impose the existence of a strict minimum and no other critical points. The proof is constructive, enabling us to display explicit limit cycles for existing gradient-based methods. Nonetheless, it remains an open question whether such cycles arise in the high-dimensional games of interest to ML practitioners, such as GANs or multi-agent RL.
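To make the failure mode concrete, here is a minimal sketch of the classic bilinear zero-sum game f(x, y) = xy (a standard illustration, not the paper's construction): player 1 descends on f in x while player 2 descends on -f in y. The unique critical point is the origin, yet simultaneous gradient updates with any fixed step size spiral outward. The step size and starting point below are arbitrary illustrative choices.

```python
import math

# Illustrative sketch (classic example, not the paper's construction):
# simultaneous gradient descent on the bilinear zero-sum game f(x, y) = x*y.
# Player 1 minimizes f over x; player 2 minimizes -f over y.
# The unique critical point is the origin, yet the joint update matrix
# [[1, -eta], [eta, 1]] scales the iterates by sqrt(1 + eta^2) > 1 per
# step, so the players spiral away from equilibrium for any fixed step size.

eta = 0.1          # fixed step size (arbitrary illustrative choice)
x, y = 1.0, 1.0    # arbitrary non-equilibrium starting point

for t in range(100):
    grad_x = y      # d f / d x = y       (player 1's gradient)
    grad_y = -x     # d(-f) / d y = -x    (player 2's gradient)
    x -= eta * grad_x
    y -= eta * grad_y

# Distance from the equilibrium grows as (1 + eta^2)^(t/2) times the
# initial norm: after 100 steps, roughly 2.33, up from sqrt(2) ~ 1.41.
print(math.hypot(x, y))
```

This toy game exhibits the divergence alternative of the theorem for vanilla simultaneous gradient descent; the paper's stronger claim is that no reasonable algorithm can escape both failure modes (limit cycles and divergence) across all differentiable games.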
