A mean-field analysis of two-player zero-sum games

Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g. for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions that are not typically met in practice, whereas mixed Nash equilibria exist in much greater generality and may be found using mirror descent. Yet this approach does not scale to high dimensions. To address this limitation, we parametrize mixed strategies as mixtures of particles, whose positions and weights are updated using gradient descent-ascent. We study these dynamics as an interacting gradient flow over measure spaces endowed with the Wasserstein-Fisher-Rao metric. We establish global convergence to an approximate equilibrium for the related Langevin gradient-ascent dynamics, and we prove a law of large numbers that relates the particle dynamics to the mean-field dynamics. Our method identifies mixed equilibria in high dimensions and is demonstrably effective for training mixtures of GANs.
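
To make the particle parametrization concrete: each mixed strategy is represented as a weighted mixture of particles, positions follow the Wasserstein (transport) part of the flow via gradient descent-ascent on the expected payoff, and weights follow the Fisher-Rao part via multiplicative (mirror-descent-like) updates. The following minimal NumPy sketch illustrates this kind of dynamics on a hypothetical toy payoff f(x, y) = cos(x - y), whose mixed equilibrium is the pair of uniform distributions on [0, 2π); the payoff, step sizes, and particle counts are illustrative assumptions, not the paper's actual algorithm or experiments.

```python
import numpy as np

# Toy payoff on the torus: f(x, y) = cos(x - y).
# Its mixed Nash equilibrium is (uniform, uniform), with game value 0.
# All names and hyperparameters below are illustrative assumptions.

def payoff(x, y):
    # x: (n,), y: (m,) -> (n, m) matrix of pairwise payoffs f(x_i, y_j)
    return np.cos(x[:, None] - y[None, :])

def grad_x(x, y):
    return -np.sin(x[:, None] - y[None, :])   # df/dx_i, shape (n, m)

def grad_y(x, y):
    return np.sin(x[:, None] - y[None, :])    # df/dy_j, shape (n, m)

rng = np.random.default_rng(0)
n = m = 64
x = rng.uniform(0.0, 2.0 * np.pi, n)   # particles of the min player
y = rng.uniform(0.0, 2.0 * np.pi, m)   # particles of the max player
a = np.full(n, 1.0 / n)                # mixture weights, sum to 1
b = np.full(m, 1.0 / m)

eta_pos, eta_w = 0.05, 0.05            # Wasserstein / Fisher-Rao step sizes

for t in range(2000):
    F = payoff(x, y)                   # (n, m)
    Vx = F @ b                         # V_x(x_i) = sum_j b_j f(x_i, y_j)
    Vy = a @ F                         # V_y(y_j) = sum_i a_i f(x_i, y_j)
    gx = grad_x(x, y) @ b              # gradient of V_x at each x_i
    gy = a @ grad_y(x, y)              # gradient of V_y at each y_j

    # Wasserstein part: transport particle positions (descent / ascent).
    x -= eta_pos * gx
    y += eta_pos * gy

    # Fisher-Rao part: multiplicative weight updates, then renormalize.
    a *= np.exp(-eta_w * Vx); a /= a.sum()
    b *= np.exp(+eta_w * Vy); b /= b.sum()

print("estimated value of the game:", a @ payoff(x, y) @ b)
```

Since the equilibrium value of this toy game is 0, the printed estimate should shrink toward 0 as the two mixtures spread out toward the uniform distribution.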
