From Chaos to Order: Symmetry and Conservation Laws in Game Dynamics

Games are an increasingly useful tool for training and testing learning algorithms. Recent examples include GANs, AlphaZero and the AlphaStar league. However, multi-agent learning can be extremely difficult to predict and control: even in simple games, learning dynamics can behave chaotically. In this paper, we present basic mechanism design tools for constructing games with predictable and controllable dynamics. We show that arbitrarily large and complex network games, encoding both cooperation (team play) and competition (zero-sum interaction), exhibit conservation laws when agents use the standard regret-minimizing dynamics known as Follow-the-Regularized-Leader. These laws persist when different agents use different dynamics, and they encode long-range correlations between agents’ behavior, even between agents that do not interact directly. Moreover, we provide sufficient conditions under which the dynamics have multiple, linearly independent conservation laws. Increasing the number of conservation laws results in more predictable dynamics, eventually making chaotic behavior formally impossible in some cases.
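As a small illustration of what a conservation law looks like in this setting, the sketch below integrates replicator dynamics (the continuous-time form of Follow-the-Regularized-Leader with an entropic regularizer) in two-player matching pennies. The game, the starting point, the step size and all function names are assumptions chosen for illustration and are not taken from the paper. The quantity tracked is the sum of KL divergences from the uniform equilibrium to each player's current strategy, which is known to be constant along replicator trajectories in zero-sum games with an interior equilibrium; the network-game laws described above are more general than this two-player case.

```python
import math

# Matching pennies: payoff matrix of the row player (the column player gets -A).
# This game is an illustrative assumption, not an example from the paper.
A = [[1.0, -1.0], [-1.0, 1.0]]


def field(state):
    """Replicator vector field for state = [x0, x1, y0, y1]."""
    x, y = state[:2], state[2:]
    Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
    xA = [sum(x[i] * A[i][j] for i in range(2)) for j in range(2)]
    v = sum(x[i] * Ay[i] for i in range(2))  # row player's expected payoff
    dx = [x[i] * (Ay[i] - v) for i in range(2)]
    dy = [y[j] * (-xA[j] + v) for j in range(2)]
    return dx + dy


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shift(s, k, c):
        return [a + c * b for a, b in zip(s, k)]

    k1 = field(state)
    k2 = field(shift(state, k1, h / 2))
    k3 = field(shift(state, k2, h / 2))
    k4 = field(shift(state, k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def invariant(state):
    """KL(x* || x) + KL(y* || y) for the uniform equilibrium x* = y* = (1/2, 1/2)."""
    return -0.5 * sum(math.log(2.0 * p) for p in state)


def simulate(state, h, steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [list(state)]
    for _ in range(steps):
        state = rk4_step(state, h)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    traj = simulate([0.8, 0.2, 0.3, 0.7], h=0.01, steps=2000)
    values = [invariant(s) for s in traj]
    print("drift of the conserved quantity:", max(values) - min(values))
```

The strategies themselves keep cycling around the equilibrium, while the tracked quantity stays constant up to integration error. A conserved quantity of this kind confines each trajectory to a level set, which is the sense in which additional conservation laws make the dynamics more predictable.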
