Paper information - Convergence of Multi-Agent Learning with a Finite Step Size in General-Sum Games

Learning in a multi-agent system is challenging because agents are simultaneously learning and the environment is not stationary, undermining convergence guarantees. To address this challenge, this paper presents a new gradient-based learning algorithm, called Gradient Ascent with Shrinking Policy Prediction (GA-SPP), which augments the basic gradient ascent approach with the concept of shrinking policy prediction. The key idea behind this algorithm is that an agent adjusts its strategy in response to the forecasted strategy of the other agent, instead of its current one. GA-SPP is shown formally to have Nash convergence in larger settings than existing gradient-based multi-agent learning methods. Furthermore, unlike existing gradient-based methods, GA-SPP's theoretical guarantees do not assume the learning rate to be infinitesimal.
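The policy-prediction idea in the abstract can be illustrated on a two-player, two-action game. The sketch below is not GA-SPP itself: the abstract does not specify the shrinking rule, so the prediction length `gamma` is held fixed here. The function names, the matching-pennies payoffs, and the values of `eta` and `gamma` are all assumptions chosen for illustration. Each agent takes a gradient step with a finite step size `eta`, evaluating its gradient at the opponent's forecasted strategy rather than the current one.

```python
# Illustrative sketch only: projected gradient ascent with a fixed-length
# policy prediction in a 2x2 game. The shrinking rule of GA-SPP is NOT
# implemented; all names and constants below are hypothetical.

def clip01(p):
    """Project a probability back onto [0, 1]."""
    return min(1.0, max(0.0, p))


def grads(a, b, R, C):
    """Payoff gradients for a 2x2 bimatrix game.

    a: probability that the row player picks its first action.
    b: probability that the column player picks its first action.
    Returns (dV_row/da, dV_col/db).
    """
    u_r = R[0][0] - R[0][1] - R[1][0] + R[1][1]
    u_c = C[0][0] - C[0][1] - C[1][0] + C[1][1]
    d_a = b * u_r + (R[0][1] - R[1][1])
    d_b = a * u_c + (C[1][0] - C[1][1])
    return d_a, d_b


def run(a, b, R, C, eta, gamma, steps):
    """Iterate the update; gamma = 0 reduces to plain gradient ascent."""
    for _ in range(steps):
        d_a, d_b = grads(a, b, R, C)
        # Forecast each player's next strategy from its current gradient.
        a_hat = clip01(a + gamma * d_a)
        b_hat = clip01(b + gamma * d_b)
        # Each player responds to the opponent's forecasted strategy.
        g_a, _ = grads(a, b_hat, R, C)
        _, g_b = grads(a_hat, b, R, C)
        a, b = clip01(a + eta * g_a), clip01(b + eta * g_b)
    return a, b


# Matching pennies (assumed example game); its only Nash equilibrium is
# the mixed strategy a = b = 0.5.
R = [[1.0, -1.0], [-1.0, 1.0]]
C = [[-1.0, 1.0], [1.0, -1.0]]

if __name__ == "__main__":
    print("with prediction:", run(0.9, 0.2, R, C, eta=0.05, gamma=0.1, steps=500))
    print("plain ascent:   ", run(0.9, 0.2, R, C, eta=0.05, gamma=0.0, steps=500))
```

With these assumed settings, the predicted variant spirals in toward the mixed equilibrium at (0.5, 0.5), while plain gradient ascent with the same finite step size spirals outward and stays near the boundary of the strategy space. That contrast is the kind of behavior the abstract attributes to responding to a forecast instead of the current strategy.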

Chongjie Zhang | Tonghan Wang | Xinliang Song
