Soft Graph Attention Reinforcement Learning for Multi-Agent Cooperation

Multi-agent reinforcement learning (MARL) suffers from several issues when applied to large-scale environments. Specifically, communication among agents is limited by distance or bandwidth. Moreover, interactions among agents in large-scale environments are complex, which makes it hard for each agent to weigh the differing influences of the other agents and to learn a stable policy. To address these issues, a soft graph attention reinforcement learning (SGA-RL) method is proposed. By exploiting the chain-propagation characteristics of graph neural networks, stacked graph convolution layers overcome the communication limitation and enlarge each agent's receptive field, promoting cooperative behavior among the agents. Moreover, unlike the traditional multi-head attention mechanism, which weights all heads equally, a soft attention mechanism is designed to learn the importance of each attention head, so that each agent learns to weigh the other agents' influence more effectively in large-scale environments. Simulation results indicate that agents trained with SGA-RL learn stable and sophisticated cooperative strategies in large-scale environments.
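The core idea of weighting attention heads softly, rather than treating them equally, can be illustrated with a minimal NumPy sketch. This is an assumed, simplified rendering of the mechanism described in the abstract (the function name, shapes, and single-layer structure are illustrative, not the paper's actual implementation): each head computes standard scaled dot-product attention over the agents' features, and the heads' outputs are then combined with a learned softmax weighting instead of a uniform average.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_multi_head_attention(features, Wq, Wk, Wv, head_logits):
    """Hypothetical sketch of soft-weighted multi-head attention.

    features:    (n_agents, d)        per-agent feature vectors
    Wq, Wk, Wv:  (n_heads, d, d_k)    per-head projection matrices
    head_logits: (n_heads,)           learned head-importance logits
    returns:     (n_agents, d_k)      aggregated features per agent
    """
    n_heads, d, d_k = Wq.shape
    head_outputs = []
    for h in range(n_heads):
        q = features @ Wq[h]                                # (n_agents, d_k)
        k = features @ Wk[h]
        v = features @ Wv[h]
        # Attention of every agent over all agents (scaled dot product).
        scores = softmax(q @ k.T / np.sqrt(d_k), axis=-1)
        head_outputs.append(scores @ v)                     # (n_agents, d_k)
    head_outputs = np.stack(head_outputs)                   # (n_heads, n_agents, d_k)
    # Soft attention over heads: learn each head's importance
    # instead of concatenating or averaging heads uniformly.
    head_weights = softmax(head_logits)                     # (n_heads,)
    return np.tensordot(head_weights, head_outputs, axes=1) # (n_agents, d_k)
```

In a full model, `head_logits` (or a small network producing it) would be trained end-to-end alongside the projection matrices, and the attention layer would be stacked to enlarge each agent's receptive field beyond its immediate communication neighborhood.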
