Convergence Analysis of Gradient-Based Learning in Continuous Games

We consider a class of gradient-based multi-agent learning algorithms in non-cooperative settings and provide guarantees of convergence to a neighborhood of a stable Nash equilibrium. In particular, we study continuous games in which agents learn in (1) deterministic settings with oracle access to their individual gradients and (2) stochastic settings with unbiased estimators of their individual gradients. We also study the effects of non-uniform learning rates, which distort the vector field and can thereby alter both the learning path and the equilibrium to which the agents converge. We support the analysis with numerical examples that provide insight into how games may be synthesized to achieve desirable equilibria.
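The two learning settings can be illustrated with a minimal sketch on a toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: a hypothetical two-player quadratic game with costs f1(x, y) = x^2/2 + 2xy and f2(x, y) = y^2/2 - 2xy, whose unique Nash equilibrium is the origin, together with the names `learn`, `J`, and the chosen step sizes and noise level. Each player descends the gradient of its own cost with respect to its own action, optionally observed through zero-mean Gaussian noise.

```python
import numpy as np

# Hypothetical game (an assumption, not from the paper):
#   f1(x, y) = 0.5 * x**2 + 2 * x * y   (player 1 controls x)
#   f2(x, y) = 0.5 * y**2 - 2 * x * y   (player 2 controls y)
# The stacked individual gradients (df1/dx, df2/dy) are linear: g(z) = J @ z.
J = np.array([[1.0, 2.0],
              [-2.0, 1.0]])

def learn(z0, rates, steps=500, noise_std=0.0, seed=0):
    """Simultaneous gradient play with per-player learning rates.

    noise_std = 0 gives the deterministic (oracle) setting; noise_std > 0
    adds zero-mean noise, so each player sees an unbiased gradient estimate.
    """
    rng = np.random.default_rng(seed)
    z = np.array(z0, dtype=float)
    rates = np.asarray(rates, dtype=float)
    path = [z.copy()]
    for _ in range(steps):
        g = J @ z  # each player's gradient of its own cost in its own action
        if noise_std > 0:
            g = g + rng.normal(0.0, noise_std, size=z.shape)
        z = z - rates * g  # elementwise: player i uses rates[i]
        path.append(z.copy())
    return np.array(path)

z0 = [1.0, 1.0]
uniform = learn(z0, rates=[0.1, 0.1])       # same rate for both players
nonuniform = learn(z0, rates=[0.1, 0.02])   # player 2 learns more slowly
noisy = learn(z0, rates=[0.1, 0.1], noise_std=0.1)

print("deterministic, uniform rates:    ", np.linalg.norm(uniform[-1]))
print("deterministic, non-uniform rates:", np.linalg.norm(nonuniform[-1]))
print("stochastic, uniform rates:       ", np.linalg.norm(noisy[-1]))
```

In this sketch the deterministic runs contract to the equilibrium, while the stochastic run with a constant step size only settles into a neighborhood of it. Because the toy game has a single equilibrium, the non-uniform rates change only the learning path here; showing a change in the limiting equilibrium would require a game with several stable equilibria.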
