ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind

Being able to predict the mental states of others is a key factor in effective social interaction. It is also crucial for distributed multi-agent systems, where agents are required to communicate and cooperate. In this paper, we introduce such an important social-cognitive skill, i.e., Theory of Mind (ToM), to build socially intelligent agents who are able to communicate and cooperate effectively to accomplish challenging tasks. With ToM, each agent is capable of inferring the mental states and intentions of others according to its (local) observation. Based on the inferred states, the agents decide "when" and with "whom" to share their intentions. With the information observed, inferred, and received, the agents decide their sub-goals and reach a consensus among the team. In the end, the low-level executors independently take primitive actions to accomplish the sub-goals. We demonstrate the idea in two typical target-oriented multi-agent tasks: cooperative navigation and multi-sensor target coverage. The experiments show that the proposed model not only outperforms the state-of-the-art methods on reward and communication efficiency, but also shows good generalization across different scales of the environment.
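The infer, communicate, and choose-sub-goal loop described in the abstract can be illustrated with a toy sketch. This is not the paper's model: the learned ToM and communication networks are replaced by hand-written nearest-target heuristics, and every name here (`nearest`, `plan_step`, `obs_radius`) is hypothetical. It assumes 2D positions and at least two targets.

```python
import math

def nearest(pos, targets, exclude=()):
    """Index of the target closest to pos, skipping indices in exclude."""
    candidates = [k for k in range(len(targets)) if k not in exclude]
    return min(candidates, key=lambda k: math.dist(pos, targets[k]))

def plan_step(agents, targets, obs_radius=math.inf):
    """One round of infer -> communicate -> choose sub-goal (heuristic stand-in)."""
    n = len(agents)
    # Each agent's own intention: head for its nearest target.
    intent = [nearest(a, targets) for a in agents]

    # "When" and "whom": agent i messages j only if it predicts a goal clash.
    messages = []
    for i in range(n):
        for j in range(n):
            if i == j or math.dist(agents[i], agents[j]) > obs_radius:
                continue  # i cannot observe j, so it infers nothing about j
            guessed_goal = nearest(agents[j], targets)  # assumed ToM stand-in
            if guessed_goal == intent[i]:
                messages.append((i, j))

    # Receivers yield a contested target to a closer sender (ties: lower index wins).
    goals = list(intent)
    for sender, receiver in messages:
        t = intent[sender]
        sender_key = (math.dist(agents[sender], targets[t]), sender)
        receiver_key = (math.dist(agents[receiver], targets[t]), receiver)
        if goals[receiver] == t and sender_key < receiver_key:
            goals[receiver] = nearest(agents[receiver], targets, exclude={t})
    return messages, goals

agents = [(0, 0), (1, 0), (10, 0)]
targets = [(0.5, 0), (9, 0), (4, 0)]
messages, goals = plan_step(agents, targets)
print(messages)  # [(0, 1), (1, 0)]
print(goals)     # [0, 2, 1]
```

In this example only the two agents that would collide on target 0 exchange messages; the third agent stays silent, which is the kind of selective communication the abstract refers to.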

Yizhou Wang | Yuan-fang Wang | Fangwei Zhong | Jing Xu

[1] Wenhan Luo, et al. Towards Distraction-Robust Active Visual Tracking, 2021, ICML.

[2] A. Bayen, et al. The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games, 2021, NeurIPS.

[3] Joshua B. Tenenbaum, et al. PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception, 2021, AAAI.

[4] Chuang Gan, et al. AGENT: A Benchmark for Core Psychological Reasoning, 2021, ICML.

[5] B. Lake, et al. Baby Intuitions Benchmark (BIB): Discerning the goals, preferences, and actions of others, 2021, NeurIPS.

[6] Jing Xu, et al. Learning Multi-Agent Coordination for Enhancing Target Coverage in Directional Sensor Networks, 2020, NeurIPS.

[7] J. Tenenbaum, et al. Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration, 2020, ICLR.

[8] Desmond C. Ong, et al. Improving Multi-Agent Cooperation using Theory of Mind, 2020, CogSci.

[9] Dieter Fox, et al. Causal Discovery in Physical Systems from Videos, 2020, NeurIPS.

[10] Tiejun Huang, et al. Learning Individually Inferred Communication for Multi-Agent Cooperation, 2020, NeurIPS.

[11] James A. Evans, et al. Too Many Cooks: Bayesian Inference for Coordinating Multi-Agent Collaboration, 2020, Top. Cogn. Sci.

[12] Shimon Whiteson, et al. Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2020, J. Mach. Learn. Res.

[13] Jing Xu, et al. Pose-Assisted Multi-Camera Collaboration for Active Object Tracking, 2020, AAAI.

[14] H. Zha, et al. Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery, 2019, AAMAS.

[15] Wenhan Luo, et al. AD-VAT+: An Asymmetric Dueling Mechanism for Learning and Understanding Visual Active Tracking, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16] Anca D. Dragan, et al. On the Utility of Learning about Humans for Human-AI Coordination, 2019, NeurIPS.

[17] Jonathan P. How, et al. Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning, 2019, ArXiv.

[18] Shimon Whiteson, et al. The StarCraft Multi-Agent Challenge, 2019, AAMAS.

[19] H. Francis Song, et al. The Hanabi Challenge: A New Frontier for AI Research, 2019, Artif. Intell.

[20] Joshua B. Tenenbaum, et al. Theory of Minds: Understanding Behavior in Groups Through Inverse Planning, 2019, AAAI.

[21] Tiejun Huang, et al. Graph Convolutional Reinforcement Learning, 2018, ICLR.

[22] Joelle Pineau, et al. TarMAC: Targeted Multi-Agent Communication, 2018, ICML.

[23] Fei Sha, et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning, 2018, ICML.

[24] Stefan Kopp, et al. Satisficing Models of Bayesian Theory of Mind for Explaining Behavior of Differently Uncertain Agents: Socially Interactive Agents Track, 2018, AAMAS.

[25] Maruan Al-Shedivat, et al. Learning Policy Representations in Multiagent Systems, 2018, ICML.

[26] Elio Tuci, et al. Cooperative Object Transport in Multi-Robot Systems: A Review of the State-of-the-Art, 2018, Front. Robot. AI.

[27] Shimon Whiteson, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2018, ICML.

[28] Rob Fergus, et al. Modeling Others using Oneself in Multi-Agent Reinforcement Learning, 2018, ICML.

[29] H. Francis Song, et al. Machine Theory of Mind, 2018, ICML.

[30] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.

[31] Joel Z. Leibo, et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning, 2017, ArXiv.

[32] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[33] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.

[34] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.

[35] Razvan Pascanu, et al. Interaction Networks for Learning about Objects, Relations and Physics, 2016, NIPS.

[36] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.

[37] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.

[38] Jordan L. Boyd-Graber, et al. Opponent Modeling in Deep Reinforcement Learning, 2016, ICML.

[39] Rob Fergus, et al. Learning Multiagent Communication with Backpropagation, 2016, NIPS.

[40] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.

[41] Alan G. Sanfey, et al. Predicting the other in cooperative interactions, 2015, Trends in Cognitive Sciences.

[42] A. Rustichini, et al. Children’s strategic theory of mind, 2014, Proceedings of the National Academy of Sciences.

[43] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[44] M. Tomasello. A Natural History of Human Thinking, 2014.

[45] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.

[46] D. Premack, et al. Does the chimpanzee have a theory of mind?, 1978, Behavioral and Brain Sciences.

[47] V. Slaughter, et al. Theory of mind and peer cooperation in two play contexts, 2019, Journal of Applied Developmental Psychology.