What to Communicate? Execution-Time Decision in Multi-agent POMDPs

In recent years, multi-agent Partially Observable Markov Decision Processes (POMDPs) have emerged as a popular decision-theoretic framework for modeling and generating policies for the control of multi-agent teams. Teams controlled by multi-agent POMDPs can use communication to share observations and coordinate; therefore, policies are needed that enable these teams to reason about communication. Previous work on generating communication policies for multi-agent POMDPs has focused on the question of when to communicate. In this paper, we address the question of what to communicate. We describe two paradigms for representing limitations on communication and present an algorithm that enables multi-agent teams to make execution-time decisions on how to effectively utilize available communication resources.
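The abstract does not specify the algorithm's details, but the core idea of an execution-time "what to communicate" decision under resource limits can be illustrated with a toy sketch. The following is not the paper's method; it is a minimal, hypothetical greedy selector in which each candidate observation carries an assumed estimate of its expected value to the team and a communication cost, and the agent shares the highest-value observations that fit within a message budget. All names (`Observation`, `value_gain`, `choose_messages`) are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """A candidate piece of local information an agent could share."""
    label: str
    value_gain: float  # assumed estimate of expected team-reward improvement if shared
    cost: float        # communication cost (e.g., bandwidth or per-message charge)

def choose_messages(observations, budget):
    """Greedily pick which observations to communicate under a total cost budget,
    preferring the highest expected gain per unit of cost. A toy stand-in for a
    principled value-of-communication computation over joint beliefs."""
    ranked = sorted(observations, key=lambda o: o.value_gain / o.cost, reverse=True)
    chosen, spent = [], 0.0
    for obs in ranked:
        if obs.value_gain > 0 and spent + obs.cost <= budget:
            chosen.append(obs.label)
            spent += obs.cost
    return chosen

# Example: with a budget of 2 messages (cost 1 each), the agent shares the two
# observations with the highest estimated value and withholds the uninformative one.
candidates = [
    Observation("goal-sighted", value_gain=5.0, cost=1.0),
    Observation("obstacle-hit", value_gain=0.5, cost=1.0),
    Observation("sensor-noise", value_gain=-0.1, cost=1.0),
]
selected = choose_messages(candidates, budget=2.0)
```

In the actual POMDP setting, `value_gain` would come from comparing expected rewards under the team's joint belief with and without the shared observation, rather than being given directly.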
