Paper Information - Large-Scale Multi-Agent Deep FBSDEs

In this paper we present a scalable deep learning framework for finding Markovian Nash Equilibria in multi-agent stochastic games using fictitious play. The framework is motivated by theoretical analysis of Forward-Backward Stochastic Differential Equations (FBSDEs) and their implementation in a deep learning setting, which is the source of our algorithm’s improved sample efficiency. By exploiting the permutation invariance of agents in symmetric games, scalability and performance are further enhanced significantly. We demonstrate that our framework outperforms the state-of-the-art deep fictitious play algorithm on an inter-bank lending/borrowing problem across multiple metrics. More importantly, our approach scales up to 3000 agents in simulation, a scale which, to the best of our knowledge, represents a new state-of-the-art. We also demonstrate the applicability of our framework in robotics on a belief-space autonomous racing problem.

Ziyi Wang | Ioannis Exarchos | Evangelos A. Theodorou | Tianrong Chen
