Coordinating Teams in Uncertain Environments: A Hybrid BDI-POMDP Approach

Distributed partially observable Markov decision problems (POMDPs) have emerged as a popular decision-theoretic approach for planning for multiagent teams, where agents must reason about the rewards (and costs) of their actions under uncertainty. However, finding the optimal distributed POMDP policy is computationally intractable (NEXP-complete). This paper focuses on a principled way to combine the two dominant paradigms for building multiagent team plans, namely the “belief-desire-intention” (BDI) approach and distributed POMDPs. In this hybrid BDI-POMDP approach, BDI team plans are exploited to improve the tractability of distributed POMDPs, while distributed POMDP-based analysis improves the performance of BDI team plans. Concretely, we focus on role allocation, a fundamental problem in BDI teams: deciding which agents to allocate to the different roles in the team. The hybrid BDI-POMDP approach provides three key contributions. First, unlike prior work in multiagent role allocation, we describe a role allocation technique that takes future uncertainties in the domain into account. The second contribution is a novel decomposition technique, which exploits the structure of the BDI team plans to significantly prune the combinatorial search space of role allocations. The third is a significantly faster policy evaluation algorithm suited to the hybrid BDI-POMDP approach. Finally, we present experimental results from two domains: mission rehearsal simulation and RoboCupRescue disaster rescue simulation. In the RoboCupRescue domain, we show that the role allocation technique presented in this paper performs at the level of human experts, as demonstrated by comparison with the allocations chosen by humans in the actual RoboCupRescue simulation environment.
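To make the role-allocation search concrete, the sketch below enumerates role allocations with a branch-and-bound procedure that prunes any partial allocation whose optimistic completion cannot beat the best complete allocation found so far. The agent names, roles, and additive expected-reward table are illustrative assumptions standing in for the allocation values the paper derives from its distributed POMDP model and plan-structure decomposition; this is a minimal sketch of bound-based pruning, not the paper's actual algorithm.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical domain data: expected reward of assigning each agent to each
# role. In the paper these values would come from evaluating the distributed
# POMDP model of the team plan, not from a fixed table.
AGENTS: List[str] = ["helo1", "helo2", "helo3"]
ROLES: List[str] = ["scout", "transport"]
EXPECTED_REWARD: Dict[Tuple[str, str], float] = {
    ("helo1", "scout"): 8.0, ("helo1", "transport"): 3.0,
    ("helo2", "scout"): 6.5, ("helo2", "transport"): 7.0,
    ("helo3", "scout"): 2.0, ("helo3", "transport"): 9.0,
}

Allocation = Dict[str, str]
Best = Tuple[float, Optional[Allocation]]


def evaluate(assigned: Allocation) -> float:
    """Value of a complete allocation (here simply additive expected rewards)."""
    return sum(EXPECTED_REWARD[(a, r)] for a, r in assigned.items())


def upper_bound(assigned: Allocation, remaining: List[str]) -> float:
    """Optimistic value: committed rewards plus each unassigned agent's best role."""
    value = evaluate(assigned)
    value += sum(max(EXPECTED_REWARD[(a, r)] for r in ROLES) for a in remaining)
    return value


def branch_and_bound(assigned: Allocation, remaining: List[str], best: Best) -> Best:
    """Depth-first search over role allocations, pruning branches whose
    optimistic bound cannot beat the best complete allocation found so far."""
    if not remaining:
        value = evaluate(assigned)
        return (value, dict(assigned)) if value > best[0] else best
    if upper_bound(assigned, remaining) <= best[0]:
        return best  # prune: even the optimistic completion cannot win
    agent, rest = remaining[0], remaining[1:]
    for role in ROLES:
        assigned[agent] = role
        best = branch_and_bound(assigned, rest, best)
        del assigned[agent]
    return best


if __name__ == "__main__":
    value, allocation = branch_and_bound({}, AGENTS, (float("-inf"), None))
    print(f"best allocation: {allocation} (expected reward {value:.1f})")
```

In this toy instance the bound is trivially additive; the point of the paper's decomposition is that the BDI team-plan structure yields comparable upper bounds for partial allocations without evaluating every completion, which is what makes pruning effective over combinatorially many allocations.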
