Solving Structured Hierarchical Games Using Differential Backward Induction

From large-scale organizations to decentralized political systems, hierarchical strategic decision-making is commonplace. We introduce a novel class of structured hierarchical games (SHGs) that formally capture such hierarchical strategic interactions. In an SHG, each player is a vertex in a tree, and the players' strategic choices are sequenced from root to leaves: the root moves first, followed by its children, then by their children, and so on down to the leaves. A player’s utility in an SHG depends on its own decision, on the choice of its parent, and on the choices of all the tree leaves. SHGs thus generalize simultaneous-move games, as well as Stackelberg games with many followers. We leverage the structure of both the sequence of player moves and the payoff dependence to develop a novel gradient-based, backpropagation-style algorithm, which we call Differential Backward Induction (DBI), for approximating equilibria of SHGs. We then provide a sufficient condition for convergence of DBI. Finally, we demonstrate the efficacy of the proposed algorithmic approach in finding approximate equilibrium solutions to several classes of SHGs.
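To make the idea of gradient-based backward induction concrete, the following is a minimal sketch on the smallest possible hierarchy: one root and one leaf. It is not the DBI algorithm itself, whose details the abstract does not give. The quadratic utilities, the step size, and the function name `solve_two_level_game` are all assumptions chosen for illustration. The leaf ascends the partial gradient of its own utility, while the root ascends a total derivative that accounts for how the leaf's choice responds to the root's, obtained by differentiating the leaf's first-order condition.

```python
def solve_two_level_game(lr=0.1, steps=500):
    """Gradient play on a toy root/leaf game (illustrative assumption).

    Hypothetical utilities:
      u_leaf(x, y) = -(y - 0.5 * x) ** 2    (leaf wants y close to x / 2)
      u_root(x, y) = -(x - 1) ** 2 - y ** 2  (root wants x near 1, y near 0)
    The root chooses x first; the leaf chooses y after observing x.
    """
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Leaf: partial gradient of u_leaf with respect to its own action y.
        grad_leaf = -2.0 * (y - 0.5 * x)

        # Sensitivity of the leaf's response to x, from differentiating the
        # leaf's first-order condition d u_leaf / dy = 0:
        #   dy/dx = -(d2 u_leaf / dy dx) / (d2 u_leaf / dy2)
        d2u_dy2 = -2.0
        d2u_dydx = 1.0
        dy_dx = -d2u_dydx / d2u_dy2

        # Root: total derivative of u_root, i.e. the direct effect of x plus
        # the indirect effect through the leaf's response.
        grad_root = -2.0 * (x - 1.0) + (-2.0 * y) * dy_dx

        # Simultaneous gradient-ascent step for both players.
        x, y = x + lr * grad_root, y + lr * grad_leaf
    return x, y


if __name__ == "__main__":
    x_star, y_star = solve_two_level_game()
    print(f"root action x = {x_star:.4f}, leaf action y = {y_star:.4f}")
```

In this toy game the leaf's best response is y = 0.5x, so the root effectively maximizes -(x - 1)^2 - 0.25x^2, whose maximizer is x = 0.8 with y = 0.4. The iterates above approach that point for the chosen step size. In a deeper tree, the same chain-rule term would have to be propagated through every intermediate level.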
