Linear Lower Bounds and Conditioning of Differentiable Games

Recent successes of game-theoretic formulations in ML have caused a resurgence of research interest in differentiable games. Overwhelmingly, that research focuses on methods and upper bounds on their speed of convergence. In this work, we approach the question of fundamental iteration complexity by providing lower bounds to complement the linear (i.e. geometric) upper bounds observed in the literature on a wide class of problems. We cast saddle-point and min-max problems as 2-player games. We leverage tools from single-objective convex optimisation to propose new linear lower bounds for convex-concave games. Notably, we give a linear lower bound for $n$-player differentiable games, by using the spectral properties of the update operator. We then propose a new definition of the condition number arising from our lower bound analysis. Unlike past definitions, our condition number captures the fact that linear rates are possible in games, even in the absence of strong convexity or strong concavity in the variables.
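The claim that linear rates are possible without strong convexity or strong concavity can be illustrated on a toy problem. The following is a minimal sketch, not the paper's construction: it assumes the scalar bilinear game min_x max_y a·x·y, a hypothetical step size `eta`, and illustrative function names. It compares the measured per-iteration contraction of simultaneous gradient descent and of the extragradient method with the spectral radius of each method's linear update operator.

```python
import math

def vector_field(x, y, a):
    """Gradient vector field of the 2-player game min_x max_y a*x*y."""
    return a * y, -a * x  # (df/dx, -df/dy)

def gradient_step(x, y, a, eta):
    """One step of simultaneous gradient descent on the vector field."""
    gx, gy = vector_field(x, y, a)
    return x - eta * gx, y - eta * gy

def extragradient_step(x, y, a, eta):
    """Extrapolate first, then update with the field at the extrapolated point."""
    xh, yh = gradient_step(x, y, a, eta)
    gx, gy = vector_field(xh, yh, a)
    return x - eta * gx, y - eta * gy

def distance_after(step, a, eta, iters, x0=1.0, y0=1.0):
    """Distance to the equilibrium (0, 0) after `iters` applications of `step`."""
    x, y = x0, y0
    for _ in range(iters):
        x, y = step(x, y, a, eta)
    return math.hypot(x, y)

# Assumed values for illustration only.
a, eta, iters = 1.0, 0.5, 50
d0 = math.hypot(1.0, 1.0)

# Measured geometric rate: average per-iteration change in distance.
gd_ratio = (distance_after(gradient_step, a, eta, iters) / d0) ** (1 / iters)
eg_ratio = (distance_after(extragradient_step, a, eta, iters) / d0) ** (1 / iters)

# Spectral radii of the update operators. The game's Jacobian has
# eigenvalues +/- i*a, so the operators have eigenvalues
# 1 -/+ i*eta*a (gradient) and 1 - (eta*a)^2 -/+ i*eta*a (extragradient).
gd_radius = math.sqrt(1 + (eta * a) ** 2)
eg_radius = math.sqrt(1 - (eta * a) ** 2 + (eta * a) ** 4)

print(f"gradient:      measured {gd_ratio:.6f}, spectral radius {gd_radius:.6f}")
print(f"extragradient: measured {eg_ratio:.6f}, spectral radius {eg_radius:.6f}")
```

In this sketch the gradient method's spectral radius exceeds 1, so its iterates move away from the equilibrium, while the extragradient radius is below 1 whenever `eta * a < 1`, giving a linear (geometric) rate even though the objective is neither strongly convex in x nor strongly concave in y.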

Ioannis Mitliagkas | Gauthier Gidel | Waïss Azizian | Adam Ibrahim
