Paper Information - Teamwork and Coordination under Model Uncertainty in DEC-POMDPs

Teamwork and Coordination under Model Uncertainty in DEC-POMDPs

Distributed Partially Observable Markov Decision Processes (DEC-POMDPs) are a popular planning framework for computing (near-)optimal plans for multiagent teamwork. However, existing DEC-POMDP methods assume a complete and correct world model, an assumption that is often violated in real-world domains. We provide a new algorithm for DEC-POMDPs that is more robust to model uncertainty, with a focus on domains with sparse agent interactions. Our STC algorithm relies on the following key ideas: (1) reduce planning-time computation by shifting some of the burden to execution-time reasoning, (2) exploit sparse interactions between agents, and (3) maintain an approximate model of agents' beliefs. We empirically show that STC is often substantially faster than existing DEC-POMDP methods without sacrificing reward performance.
