Multi-Agent Learning in Network Zero-Sum Games is a Hamiltonian System

Zero-sum games are natural, if informal, analogues of closed physical systems in which no energy/utility can enter or exit. The analogy extends further to zero-sum network (polymatrix) games, where multiple agents interact in a closed economy. Typically, (network) zero-sum games are studied from the perspective of Nash equilibria. This, however, stands in contrast to the way we typically think about closed physical systems, e.g., the Earth-Moon system, which moves perpetually along recurrent trajectories of constant energy. We establish a formal and robust connection between multi-agent systems and Hamiltonian dynamics -- the same dynamics that describe conservative systems in physics. Specifically, we show that no matter the size or network structure of such closed economies, and even if each agent uses a different online learning dynamic from the standard class of Follow-the-Regularized-Leader (FTRL), the resulting dynamics are Hamiltonian. This generalizes the known connection to Hamiltonian systems established by Hofbauer [2] for the special case of replicator dynamics in two-agent zero-sum games. Moreover, our results extend beyond zero-sum settings and provide a type of Rosetta stone (see, e.g., Table 1) that helps translate results and techniques between online optimization, convex analysis, game theory, and physics.
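
To make the conservation claim concrete, here is a minimal numerical sketch (ours, not the paper's code) of the two-agent case. It assumes entropic regularizers, for which FTRL becomes continuous-time multiplicative weights / replicator dynamics: the conjugate of the regularizer is h*(y) = log sum_j exp(y_j), strategies are x_i = softmax(y_i), and the dual-space dynamics read dy1/dt = A x2, dy2/dt = -A^T x1. Along any trajectory the "energy" H(y1, y2) = h1*(y1) + h2*(y2) is exactly conserved, since dH/dt = x1.(A x2) + x2.(-A^T x1) = 0. The payoff matrix (Matching Pennies), the step size, and all variable names below are illustrative choices.

import numpy as np

# Payoff matrix for Matching Pennies: player 1 receives x1^T A x2,
# player 2 receives -x1^T A x2, so the game is zero-sum.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def logsumexp(y):
    # Convex conjugate h*(y) of the entropic regularizer h(x) = sum_j x_j log x_j.
    m = y.max()
    return m + np.log(np.exp(y - m).sum())

def softmax(y):
    # Gradient of h*, mapping cumulative payoffs y to a mixed strategy x.
    z = np.exp(y - y.max())
    return z / z.sum()

def hamiltonian(y1, y2):
    # H(y1, y2) = h1*(y1) + h2*(y2): the conserved "energy" of the learning dynamics.
    return logsumexp(y1) + logsumexp(y2)

def field(y1, y2):
    # Dual-space FTRL dynamics: dy1/dt = A x2, dy2/dt = -A^T x1.
    x1, x2 = softmax(y1), softmax(y2)
    return A @ x2, -A.T @ x1

def rk4_step(y1, y2, dt):
    # One classical fourth-order Runge-Kutta step for the coupled system.
    a1, a2 = field(y1, y2)
    b1, b2 = field(y1 + 0.5 * dt * a1, y2 + 0.5 * dt * a2)
    c1, c2 = field(y1 + 0.5 * dt * b1, y2 + 0.5 * dt * b2)
    d1, d2 = field(y1 + dt * c1, y2 + dt * c2)
    return (y1 + dt * (a1 + 2 * b1 + 2 * c1 + d1) / 6,
            y2 + dt * (a2 + 2 * b2 + 2 * c2 + d2) / 6)

y1, y2 = np.array([0.3, -0.1]), np.array([-0.5, 0.2])  # arbitrary initial payoff vectors
H0 = hamiltonian(y1, y2)
for _ in range(20000):                                 # integrate up to t = 200
    y1, y2 = rk4_step(y1, y2, dt=0.01)
print(abs(hamiltonian(y1, y2) - H0))                   # tiny: integration error only

Running this reports a drift in H limited to the integrator's error; swapping RK4 for forward Euler makes the drift grow visibly, which is a numerical artifact rather than a property of the flow, since the continuous dynamics conserve H exactly.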

[1] Georgios Piliouras et al. Multiplicative Weights Update in Zero-Sum Games, 2018, EC.

[2] J. Hofbauer. Evolutionary dynamics for bimatrix games: A Hamiltonian system?, 1996, Journal of Mathematical Biology.

[3] Xiaotie Deng et al. Settling the complexity of computing two-player Nash equilibria, 2007, JACM.

[4] Elad Hazan et al. Introduction to Online Convex Optimization, 2016, Foundations and Trends in Optimization.

[5] Constantinos Daskalakis et al. Training GANs with Optimism, 2017, ICLR.

[6] Berthold Vöcking et al. Inapproximability of pure Nash equilibria, 2008, STOC.

[7] Christos H. Papadimitriou et al. Cycles in adversarial regularized learning, 2017, SODA.

[8] Christos H. Papadimitriou et al. On Learning Algorithms for Nash Equilibria, 2010, SAGT.

[9] Aviad Rubinstein et al. Settling the Complexity of Computing Approximate Two-Player Nash Equilibria, 2016, FOCS.

[10] Henrik I. Christensen et al. Persistent patterns: multi-agent learning beyond equilibrium and utility, 2014, AAMAS.

[11] Christos H. Papadimitriou et al. On a Network Generalization of the Minmax Theorem, 2009, ICALP.

[12] Georgios Piliouras et al. Three Body Problems in Evolutionary Game Dynamics: Convergence, Periodicity and Limit Cycles, 2018, AAMAS.

[13] J. M. Smith. Evolution and the Theory of Games, 1976.

[14] Aviad Rubinstein et al. Inapproximability of Nash Equilibrium, 2014, STOC.

[15] Jeff S. Shamma et al. Optimization Despite Chaos: Convex Relaxations to Complex Limit Sets via Poincaré Recurrence, 2014, SODA.

[16] S. Hart and A. Mas-Colell. Uncoupled Dynamics Do Not Lead to Nash Equilibrium, 2003.

[17] Yang Cai et al. Zero-Sum Polymatrix Games: A Generalization of Minmax, 2016, Mathematics of Operations Research.

[18] Christos H. Papadimitriou et al. From Nash Equilibria to Chain Recurrent Sets: Solution Concepts and Topology, 2016, ITCS.

[19] William H. Sandholm et al. Learning in Games via Reinforcement and Regularization, 2014, Mathematics of Operations Research.

[20] Gábor Lugosi et al. Prediction, Learning, and Games, 2006.

[21] Shai Shalev-Shwartz. Online Learning and Online Convex Optimization, 2012, Foundations and Trends in Machine Learning.

[22] Georg Ostrovski et al. Piecewise linear Hamiltonian flows associated to zero-sum games: Transition combinatorics and questions on ergodicity, 2010, arXiv:1011.2018.

[23] Ruta Mehta et al. Constant rank bimatrix games are PPAD-hard, 2014, STOC.

[24] Arnoud Pastink et al. On the communication complexity of approximate Nash equilibria, 2012, Games and Economic Behavior.

[25] H. Peyton Young. Strategic Learning and Its Limits, 2004.

[26] Yishay Mansour et al. How long to equilibrium? The communication complexity of uncoupled equilibrium procedures, 2010, Games and Economic Behavior.

[27] Yang Cai et al. On minmax theorems for multiplayer games, 2011, SODA.

[28] Paul W. Goldberg et al. The complexity of computing a Nash equilibrium, 2006, STOC.

[29] Éva Tardos et al. Beyond the Nash Equilibrium Barrier, 2011, ICS.

[30] Sebastian van Strien et al. Hamiltonian flows with random-walk behaviour originating from zero-sum games and fictitious play, 2011.

[31] Eizo Akiyama et al. Chaos in learning a simple two-person game, 2002, Proceedings of the National Academy of Sciences.

[32] Thore Graepel et al. The Mechanics of n-Player Differentiable Games, 2018, ICML.