Large-Scale Multi-Agent Deep FBSDEs

In this paper we present a scalable deep learning framework for finding Markovian Nash equilibria in multi-agent stochastic games using fictitious play. The framework is motivated by theoretical analysis of Forward-Backward Stochastic Differential Equations (FBSDEs) and their implementation in a deep learning setting, which is the source of our algorithm’s improved sample efficiency. By exploiting the permutation invariance of agents in symmetric games, we further improve scalability and performance. On an inter-bank lending/borrowing problem, our framework outperforms the state-of-the-art deep fictitious play algorithm across multiple metrics. More importantly, our approach scales up to 3000 agents in simulation, which, to the best of our knowledge, represents a new state of the art. We also demonstrate the applicability of our framework in robotics on a belief-space autonomous racing problem.
