Lyapunov Exponents for Diversity in Differentiable Games

Ridge Rider (RR) is an algorithm for finding diverse solutions to optimization problems by following eigenvectors of the Hessian (“ridges”). RR is designed for conservative gradient systems (i.e., settings with a single loss function), where it branches at saddles, which are easy-to-find bifurcation points. We generalize this idea to nonconservative, multi-agent gradient systems by proposing a method, denoted Generalized Ridge Rider (GRR), for finding arbitrary bifurcation points. We provide theoretical motivation for our method by leveraging machinery from the field of dynamical systems. We construct novel toy problems in which we can visualize new phenomena that also give insight into high-dimensional problems of interest. Finally, we empirically evaluate our method by finding diverse solutions in the iterated prisoners’ dilemma and in relevant machine learning problems, including generative adversarial networks.

ACM Reference Format: Jonathan Lorraine, Paul Vicol, Jack Parker-Holder, Tal Kachman, Luke Metz, and Jakob Foerster. 2022. Lyapunov Exponents for Diversity in Differentiable Games. In Proc. of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022), Online, May 9–13, 2022, IFAAMAS, 24 pages.
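
To make the role of Lyapunov exponents concrete, the sketch below estimates the maximal Lyapunov exponent of simultaneous gradient descent-ascent on a simple bilinear game; a positive estimate signals exponential sensitivity of the learning dynamics to initial conditions, the kind of behavior one can exploit to branch toward qualitatively different solutions. This is a minimal illustration under assumed choices (the toy game f(x, y) = x * y, the step size, the perturbation scale, and the iteration count), not the authors' implementation.

```python
# Minimal, illustrative sketch (not the paper's implementation): estimate the
# maximal Lyapunov exponent of simultaneous gradient descent-ascent (GDA) on the
# toy bilinear game f(x, y) = x * y. The game, step size `lr`, perturbation size
# `eps`, and iteration count are assumptions chosen only for illustration.
import numpy as np

def gda_step(z, lr=0.1):
    """One step of simultaneous GDA on f(x, y) = x * y, where player 1 (x)
    minimizes f and player 2 (y) maximizes f."""
    x, y = z
    return np.array([x - lr * y, y + lr * x])

def max_lyapunov_exponent(z0, num_steps=1000, eps=1e-6):
    """Estimate the maximal Lyapunov exponent by following a nearby trajectory
    and renormalizing its separation from the reference trajectory each step."""
    z = np.asarray(z0, dtype=float)
    z_pert = z + np.array([eps, 0.0])  # perturb the initial point by distance eps
    log_growth = 0.0
    for _ in range(num_steps):
        z, z_pert = gda_step(z), gda_step(z_pert)
        sep = np.linalg.norm(z_pert - z)
        log_growth += np.log(sep / eps)
        # Rescale the perturbed trajectory back to distance eps from the reference.
        z_pert = z + (z_pert - z) * (eps / sep)
    return log_growth / num_steps

if __name__ == "__main__":
    # A positive estimate indicates that nearby parameter settings diverge under
    # the learning dynamics; GDA on f(x, y) = x * y spirals outward, so the
    # estimate here is positive.
    print(max_lyapunov_exponent([1.0, 1.0]))
```

The per-step renormalization of the separation vector is the standard Benettin-style procedure for estimating the leading exponent of a discrete-time map, applied here to the map defined by one step of the game's learning dynamics.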
