ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind

The ability to predict the mental states of others is a key factor in effective social interaction. It is also crucial for distributed multi-agent systems, where agents are required to communicate and cooperate. In this paper, we introduce this important social-cognitive skill, i.e., Theory of Mind (ToM), to build socially intelligent agents that can communicate and cooperate effectively to accomplish challenging tasks. With ToM, each agent infers the mental states and intentions of others from its (local) observation. Based on the inferred states, the agents decide “when” and with “whom” to share their intentions. With the information observed, inferred, and received, the agents decide their sub-goals and reach a consensus among the team. Finally, low-level executors independently take primitive actions to accomplish the sub-goals. We demonstrate the idea in two typical target-oriented multi-agent tasks: cooperative navigation and multi-sensor target coverage. The experiments show that the proposed model not only outperforms the state-of-the-art methods in reward and communication efficiency, but also generalizes well across different scales of the environment.
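
The observe, infer, communicate, and decide loop described above can be illustrated with a toy sketch. This is not the paper's model: the learned ToM module is replaced here by a hand-written nearest-target heuristic, and the function names (`nearest_target`, `infer_goals`, `plan_step`), the 2D positions, and the conflict-based messaging rule are all assumptions made for illustration only.

```python
import math


def nearest_target(pos, targets, exclude=()):
    """Index of the target closest to pos, skipping indices in exclude."""
    candidates = [k for k in range(len(targets)) if k not in exclude]
    return min(candidates, key=lambda k: math.dist(pos, targets[k]))


def infer_goals(positions, targets):
    """Hypothetical stand-in for a learned ToM module:
    guess that every agent intends to reach its nearest target."""
    return {j: nearest_target(p, targets) for j, p in positions.items()}


def plan_step(agent_positions, targets):
    """One toy round of: form intentions, infer others, message, pick sub-goals."""
    # Each agent's own initial intention (same heuristic, for simplicity).
    goals = infer_goals(agent_positions, targets)

    # Decide "when" and with "whom" to communicate: agent i messages agent j
    # only if it predicts j wants the same target and i is strictly closer.
    messages = []  # (sender, receiver, target claimed by sender)
    for i, pos in agent_positions.items():
        others = {j: p for j, p in agent_positions.items() if j != i}
        for j, g in infer_goals(others, targets).items():
            if g == goals[i] and math.dist(pos, targets[g]) < math.dist(others[j], targets[g]):
                messages.append((i, j, g))

    # Receivers give up claimed targets and pick the next-nearest one.
    claimed_for = {}
    for _sender, receiver, target in messages:
        claimed_for.setdefault(receiver, set()).add(target)

    final = dict(goals)
    for receiver, claimed in claimed_for.items():
        if len(claimed) < len(targets):  # keep the old goal if nothing is left
            final[receiver] = nearest_target(agent_positions[receiver], targets, exclude=claimed)
    return final, messages


agents = {0: (0.0, 0.0), 1: (1.0, 0.0)}
targets = [(2.0, 0.0), (-5.0, 0.0)]
sub_goals, sent = plan_step(agents, targets)
print(sub_goals)  # {0: 1, 1: 0}
print(sent)       # [(1, 0, 0)]
```

In this example both agents initially prefer target 0. Agent 1 is closer, predicts the conflict, and sends a single message, after which agent 0 switches to target 1. When no conflict is predicted, no message is sent, which is the sense in which inferring others' intentions can reduce communication.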
