Paper Information - For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria

Although it has been known since the 1970s that a globally optimal strategy profile in a common-payoff game is a Nash equilibrium, global optimality is a strict requirement that limits the result's applicability. In this work, we show that any locally optimal symmetric strategy profile is also a (global) Nash equilibrium. Furthermore, we show that this result is robust to perturbations to the common payoff and to the local optimum. Applied to machine learning, our result provides a global guarantee for any gradient method that finds a local optimum in symmetric strategy space. While this result indicates stability to unilateral deviation, we nevertheless identify broad classes of games where mixed local optima are unstable under joint, asymmetric deviations. We analyze the prevalence of instability by running learning algorithms in a suite of symmetric games, and we conclude by discussing the applicability of our results to multi-agent RL, cooperative inverse RL, and decentralized POMDPs.
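The two claims in the abstract can be illustrated on a toy example. The sketch below is not taken from the paper: the 2-player, 2-action payoff matrix `U`, the helper names, and the finite-difference gradient ascent are all assumptions chosen for illustration. It finds a local optimum in symmetric strategy space, checks that no unilateral deviation improves the common payoff, and then shows that a joint, asymmetric deviation does.

```python
# Hypothetical 2-player, 2-action common-payoff game (not from the paper).
# U[a][b] is the payoff both players receive; U is symmetric, so the game
# is unchanged when the players are swapped.
U = [[0.0, 1.0],
     [1.0, 0.0]]

def team_payoff(p, q):
    """Expected common payoff when player 1 plays action 0 with
    probability p and player 2 plays action 0 with probability q."""
    x = [p, 1.0 - p]
    y = [q, 1.0 - q]
    return sum(x[a] * y[b] * U[a][b] for a in range(2) for b in range(2))

def symmetric_payoff(p):
    """Common payoff when both players use the same mixed strategy p."""
    return team_payoff(p, p)

def ascend(p, lr=0.1, steps=500, h=1e-6):
    """Projected gradient ascent on the symmetric payoff, using a
    central-difference estimate of the gradient."""
    for _ in range(steps):
        grad = (symmetric_payoff(p + h) - symmetric_payoff(p - h)) / (2 * h)
        p = min(1.0, max(0.0, p + lr * grad))  # project back onto [0, 1]
    return p

def best_unilateral_gain(p):
    """Largest gain one player can obtain by deviating alone while the
    other keeps playing p. The payoff is linear in the deviator's own
    strategy, so checking the two pure actions is sufficient."""
    return max(team_payoff(d, p) for d in (0.0, 1.0)) - symmetric_payoff(p)

p_star = ascend(0.2)                     # local optimum in symmetric space
unilateral_gain = best_unilateral_gain(p_star)
joint_payoff = team_payoff(1.0, 0.0)     # players pick different actions

print("symmetric local optimum p*:", p_star)
print("payoff at p*:", symmetric_payoff(p_star))
print("best unilateral gain:", unilateral_gain)
print("payoff after joint asymmetric deviation:", joint_payoff)
```

In this toy game the symmetric local optimum is mixed (close to p = 0.5), the best unilateral gain is numerically zero, and the asymmetric pure profile earns a strictly higher common payoff, which matches the distinction the abstract draws between unilateral and joint deviations.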
