Learning Individually Inferred Communication for Multi-Agent Cooperation

Communication lays the foundation for human cooperation, and it is equally crucial for multi-agent cooperation. However, existing work focuses on broadcast communication, which is not only impractical but also introduces information redundancy that can even impair the learning process. To tackle these difficulties, we propose Individually Inferred Communication (I2C), a simple yet effective model that enables agents to learn a prior for agent-agent communication. The prior knowledge is learned via causal inference and realized by a feed-forward neural network that maps an agent's local observation to a belief about whom to communicate with. The influence of one agent on another is inferred via the joint action-value function in multi-agent reinforcement learning and quantified to label the necessity of agent-agent communication. Furthermore, the agent policy is regularized to better exploit communicated messages. Empirically, we show that I2C not only reduces communication overhead but also improves performance in a variety of multi-agent cooperative scenarios, compared to existing methods.
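
For concreteness, the sketch below illustrates the two ingredients described above: a feed-forward prior network that maps an agent's local observation (plus a candidate partner's identity) to a belief about whether to communicate, and a causal-effect measure derived from the joint action-value function that labels whether communication is necessary. This is a minimal sketch in PyTorch under simplifying assumptions (discrete actions, a uniform prior when marginalizing the partner's action); PriorNet, causal_effect, and the threshold delta are illustrative names, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PriorNet(nn.Module):
    """Feed-forward prior: maps agent i's local observation plus a one-hot
    id of a candidate partner j to logits over {no-communication, communication}."""
    def __init__(self, obs_dim: int, n_agents: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_agents, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, obs: torch.Tensor, partner_onehot: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, partner_onehot], dim=-1))


def causal_effect(q_ij: torch.Tensor, a_j: int) -> torch.Tensor:
    """Quantify agent j's influence on agent i from the joint action-value
    function. q_ij[k, l] holds Q when agent j takes action k and agent i
    takes action l (other agents' actions held fixed). The effect is the
    KL divergence between i's action distribution conditioned on j's actual
    action and the distribution with j's action marginalized out (uniform
    prior over a_j is an assumption of this sketch)."""
    p_cond = F.softmax(q_ij[a_j], dim=-1)          # P(a_i | a_j)
    p_marg = F.softmax(q_ij, dim=-1).mean(dim=0)   # P(a_i), a_j marginalized
    return (p_cond * (p_cond / p_marg).log()).sum()


# Usage: label communication as necessary when the causal effect exceeds a
# hypothetical threshold delta; these binary labels supervise PriorNet.
q_ij = torch.randn(5, 5)  # 5 actions per agent, random stand-in values
delta = 0.1
need_comm = causal_effect(q_ij, a_j=2) > delta
```

A design note on the sketch: the prior network only ever consumes the agent's local observation, so at execution time each agent can decide whom to query without any global information; the (more expensive) joint action-value computation is needed only during training to generate the communication labels.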
