Teamwork in distributed POMDPs: execution-time coordination under model uncertainty

Despite their NEXP-complete policy generation complexity [1], Distributed Partially Observable Markov Decision Problems (DEC-POMDPs) have become a popular paradigm for multiagent teamwork [2, 6, 8]. DEC-POMDPs can quantitatively express observational and action uncertainty, and yet optimally plan communications and domain actions.

This paper focuses on teamwork under model uncertainty (i.e., potentially inaccurate transition and observation functions) in DEC-POMDPs. In many domains, we only have an approximate model of agent observation or transition functions. To address this challenge we rely on execution-centric frameworks [7, 11, 12], which simplify planning in DEC-POMDPs (e.g., by assuming cost-free communication at plan-time) and shift coordination reasoning to execution time. Specifically, during planning, these frameworks use a standard single-agent POMDP planner [4] to plan a policy for the team of agents by assuming zero-cost communication. Then, at execution time, agents model other agents' beliefs and actions, reason about when to communicate with teammates, and reason about what action to take if not communicating (an illustrative sketch of this execution-time reasoning appears at the end of this section). Unfortunately, past work in execution-centric approaches [7, 11, 12] also assumes a correct world model, and the presence of model uncertainty exposes key weaknesses that result in erroneous plans and additional inefficiency due to reasoning over incorrect world models at every decision epoch.

This paper provides two sets of contributions. The first is a new execution-centric framework for DEC-POMDPs called MODERN (MOdel uncertainty in Dec-pomdp Execution-time ReasoNing). MODERN is the first execution-centric framework for DEC-POMDPs explicitly motivated by model uncertainty. It is based on
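To make the execution-time reasoning described above concrete, the following is a minimal illustrative sketch, not the MODERN algorithm itself: an agent maintains a Bayesian belief over world states, keeps a (possibly stale) estimate of a teammate's belief, and communicates only when the estimated gain from synchronising beliefs exceeds an assumed communication cost. All identifiers (COMM_COST, q_value, joint_actions, etc.) are hypothetical placeholders introduced for illustration.

```python
# Illustrative sketch of execution-time coordination reasoning in an
# execution-centric DEC-POMDP framework (assumed, simplified structure).
from collections import defaultdict

COMM_COST = 0.5  # assumed fixed cost of one belief synchronisation


def belief_update(belief, transition, observation_fn, action, obs):
    """Standard Bayesian POMDP belief update: b'(s') ~ O(o|s',a) * sum_s T(s'|s,a) b(s)."""
    new_belief = defaultdict(float)
    for s, p in belief.items():
        for s2, pt in transition(s, action).items():
            new_belief[s2] += p * pt * observation_fn(s2, action).get(obs, 0.0)
    total = sum(new_belief.values())
    return {s: p / total for s, p in new_belief.items()} if total > 0 else dict(belief)


def expected_value(belief, q_value, action):
    """Expected Q-value of an action under a belief over states."""
    return sum(p * q_value(s, action) for s, p in belief.items())


def decide(my_belief, teammate_belief_estimate, joint_actions, q_value):
    """Return (communicate?, action) for one decision epoch."""
    # Action my own (up-to-date) belief suggests: a proxy for the choice
    # the team would make if beliefs were synchronised.
    best_synced = max(joint_actions, key=lambda a: expected_value(my_belief, q_value, a))
    # Action the teammate is likely to take given our stale model of its belief.
    best_unsynced = max(joint_actions,
                        key=lambda a: expected_value(teammate_belief_estimate, q_value, a))
    # Estimated gain from communicating = value difference, evaluated under my belief.
    gain = (expected_value(my_belief, q_value, best_synced)
            - expected_value(my_belief, q_value, best_unsynced))
    if gain > COMM_COST:
        return True, best_synced    # worth paying the communication cost
    return False, best_unsynced     # act on the uncommunicated estimate


if __name__ == "__main__":
    # Toy two-state example with two actions; the numbers are arbitrary.
    q = lambda s, a: {("s0", "go"): 1.0, ("s0", "wait"): 0.2,
                      ("s1", "go"): -1.0, ("s1", "wait"): 0.2}[(s, a)]
    print(decide({"s0": 0.9, "s1": 0.1},      # my up-to-date belief
                 {"s0": 0.4, "s1": 0.6},      # stale estimate of teammate's belief
                 ["go", "wait"], q))          # -> (True, 'go'): communicating pays off
```

The design choice illustrated here, trading an assumed communication cost against the expected loss from acting on a stale teammate model, is the kind of per-epoch reasoning that execution-centric frameworks perform instead of planning communication offline.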