
Solving Structured Hierarchical Games Using Differential Backward Induction

From large-scale organizations to decentralized political systems, hierarchical strategic decision making is commonplace. We introduce a novel class of structured hierarchical games (SHGs) that formally capture such hierarchical strategic interactions. In an SHG, each player is a vertex in a tree, and the players' strategic choices are sequenced from root to leaves: the root moves first, followed by its children, then by their children, and so on until the leaves. A player’s utility in an SHG depends on its own decision and on the choices of its parent and of all the leaves of the tree. SHGs thus generalize simultaneous-move games, as well as Stackelberg games with many followers. We leverage the structure of both the sequence of player moves and the payoff dependence to develop a novel gradient-based, backpropagation-style algorithm, which we call Differential Backward Induction (DBI), for approximating equilibria of SHGs. We then provide a sufficient condition for convergence of DBI. Finally, we demonstrate the efficacy of the proposed algorithmic approach in finding approximate equilibrium solutions to several classes of SHGs.
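To make the backpropagation-style idea concrete, the following is a minimal sketch on the smallest possible hierarchy: one root and one leaf. It is not the DBI algorithm from the paper. The quadratic utilities, the coupling constant `A`, the step size, and all function names are hypothetical, and the use of the implicit function theorem to obtain the leaf's response slope is an assumption made for illustration. The sketch only shows how a leader's gradient can include a term that accounts for the follower's reaction.

```python
# Toy two-player hierarchy: a root (leader) picks x, then a leaf (follower) picks y.
# All utilities and constants here are hypothetical, chosen so the math is easy to check.
A = 0.5  # hypothetical coupling: the leaf wants y close to A * x


def leaf_grad_y(x, y):
    # d/dy of u_leaf(x, y) = -(y - A*x)**2
    return -2.0 * (y - A * x)


def root_partial_x(x, y):
    # partial d/dx of u_root(x, y) = -(x - 1)**2 - (y - 2)**2
    return -2.0 * (x - 1.0)


def root_partial_y(x, y):
    # partial d/dy of the same u_root
    return -2.0 * (y - 2.0)


def leaf_response_slope(x, y):
    # Implicit function theorem applied to the leaf's first-order condition:
    # dy/dx = -(d2u_leaf/dy2)^-1 * d2u_leaf/dydx, which equals A for this utility.
    d2_yy = -2.0
    d2_yx = 2.0 * A
    return -d2_yx / d2_yy


def solve(x=0.0, y=0.0, lr=0.1, steps=200):
    # Simultaneous gradient ascent; the root uses a total derivative that
    # propagates the leaf's reaction back into its own update.
    for _ in range(steps):
        gy = leaf_grad_y(x, y)
        gx = root_partial_x(x, y) + root_partial_y(x, y) * leaf_response_slope(x, y)
        x, y = x + lr * gx, y + lr * gy
    return x, y


x_star, y_star = solve()
```

For these particular utilities the leaf's best response is y = A·x, so the root's problem reduces to maximizing -(x - 1)² - (0.5x - 2)², whose stationary point is x = 1.6 with y = 0.8. The iteration above settles on that point for the chosen step size.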

Milind Tambe | Yevgeniy Vorobeychik | Mithun Chakraborty | Shahin Jabbari | Feiran Jia | Aditya Mate | Zun Li

[1] Jacob Abernethy, et al. Last-iterate convergence rates for min-max optimization, 2019, ALT.

[2] Vikash Kumar, et al. A Game Theoretic Framework for Model Based Reinforcement Learning, 2020, ICML.

[3] J. Zico Kolter, et al. OptNet: Differentiable Optimization as a Layer in Neural Networks, 2017, ICML.

[4] Michael Hill, et al. The Public Policy Process, 2005.

[5] J. Zico Kolter, et al. Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games, 2019, AAAI.

[6] J. Zico Kolter, et al. Gradient descent GAN optimization is locally stable, 2017, NIPS.

[7] Charles R. Johnson, et al. Matrix analysis, 1985, Statistical Inference for Engineers and Data Scientists.

[8] S. Shankar Sastry, et al. On Gradient-Based Learning in Continuous Games, 2018, SIAM J. Math. Data Sci.

[9] Sven Leyffer, et al. Solving Multi-Leader-Follower Games, 2005.

[10] David Duvenaud, et al. Optimizing Millions of Hyperparameters by Implicit Differentiation, 2019, AISTATS.

[11] Milind Tambe, et al. Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization, 2018, AAAI.

[12] Yevgeniy Vorobeychik, et al. A Game-Theoretic Approach for Hierarchical Policy-Making, 2021, ArXiv.

[13] Zhengyuan Zhou, et al. Learning in games with continuous action sets and unknown payoff functions, 2019, Math. Program.

[14] Vincent Conitzer, et al. Computing optimal strategies to commit to in extensive-form games, 2010, EC '10.

[15] Stefano Coniglio, et al. Methods for Finding Leader-Follower Equilibria with Multiple Followers: (Extended Abstract), 2016, AAMAS.

[16] Lillian J. Ratliff, et al. Convergence Analysis of Gradient-Based Learning in Continuous Games, 2019, UAI.

[17] Oded Galor, et al. Discrete Dynamical Systems, 2005.

[18] Milind Tambe, et al. End to end learning and optimization on graphs, 2019, NeurIPS.

[19] M. Shub. Global Stability of Dynamical Systems, 1986.

[20] Thanh H. Nguyen, et al. Partial Adversarial Behavior Deception in Security Games, 2020, IJCAI.

[21] Alistair Letcher, et al. On the Impossibility of Global Convergence in Multi-Loss Optimization, 2020, ICLR.

[22] Jing Yu, et al. End-to-End Learning and Intervention in Games, 2020, NeurIPS.

[23] Tamer Basar, et al. Distributed algorithms for the computation of noncooperative equilibria, 1987, Autom.

[24] Ioannis Mitliagkas, et al. Stochastic Hamiltonian Gradient Methods for Smooth Games, 2020, ICML.

[25] Tuomas Sandholm, et al. Discretization of Continuous Action Spaces in Extensive-Form Games, 2015, AAMAS.

[26] Vladlen Koltun, et al. Deep Equilibrium Models, 2019, NeurIPS.

[27] Nitakshi Goyal, et al. General Topology-I, 2017.

[28] R. Rockafellar, et al. Implicit Functions and Solution Mappings, 2009.

[29] Stephen P. Boyd, et al. Differentiable Convex Optimization Layers, 2019, NeurIPS.

[30] Byron Boots, et al. Truncated Back-propagation for Bilevel Optimization, 2018, AISTATS.

[31] Shahin Jabbari, et al. Modeling between-population variation in COVID-19 dynamics in Hubei, Lombardy, and New York City, 2020, Proceedings of the National Academy of Sciences.

[32] Marcello Restelli, et al. Equilibrium approximation in simulation-based extensive-form games, 2011, AAMAS.

[33] Michael I. Jordan, et al. What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?, 2019, ICML.

[34] Tanner Fiez, et al. Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study, 2020, ICML.

[35] Michael L. Littman, et al. Graphical Models for Game Theory, 2001, UAI.

[36] Paul W. Goldberg, et al. The complexity of computing a Nash equilibrium, 2006, STOC '06.

[37] Heinrich von Stackelberg. The Theory of the Market Economy, translated from the German and with an introduction by Alan T. Peacock, 1953.

[38] Michael P. Wellman, et al. Gradient methods for Stackelberg security games, 2016, UAI.

[39] Guodong Zhang, et al. On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach, 2019, ICLR.

[40] Subhransu Maji, et al. Meta-Learning With Differentiable Convex Optimization, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41] Sebastian Nowozin, et al. The Numerics of GANs, 2017, NIPS.

[42] Thore Graepel, et al. The Mechanics of n-Player Differentiable Games, 2018, ICML.

[43] Ioannis Mitliagkas, et al. Linear Lower Bounds and Conditioning of Differentiable Games, 2019, ICML.

[44] Gal Chechik, et al. Auxiliary Learning by Implicit Differentiation, 2020, ArXiv.

[45] Constantinos Daskalakis, et al. The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization, 2018, NeurIPS.

[47] Michael H. Bowling, et al. Finding Optimal Abstract Strategies in Extensive-Form Games, 2012, AAAI.

[48] J. Zico Kolter, et al. What game are we playing? End-to-end learning in normal and extensive form games, 2018, IJCAI.

[49] Sergey Levine, et al. Meta-Learning with Implicit Gradients, 2019, NeurIPS.

[50] Anoop Cherian, et al. On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization, 2016, ArXiv.

[51] S. Wiggins. Introduction to Applied Nonlinear Dynamical Systems and Chaos, 1989.