Hybrid BDI-POMDP Framework for Multiagent Teaming

Many current large-scale multiagent team implementations can be characterized as following the "belief-desire-intention" (BDI) paradigm, with explicit representation of team plans. Despite their promise, current BDI team approaches lack tools for quantitative performance analysis under uncertainty. Distributed partially observable Markov decision problems (POMDPs) are well suited for such analysis, but finding optimal policies in such models is computationally intractable. The key contribution of this article is a hybrid BDI-POMDP approach, where BDI team plans are exploited to improve POMDP tractability and POMDP analysis improves BDI team plan performance. Concretely, we focus on role allocation, a fundamental problem in BDI teams: which agents to allocate to the different roles in the team. The article makes three key contributions. First, we describe a role allocation technique that takes into account future uncertainties in the domain; prior work in multiagent role allocation has failed to address such uncertainties. To that end, we introduce RMTDP (Role-based Markov Team Decision Problem), a new distributed POMDP model for analysis of role allocations. Our technique gains in tractability by significantly curtailing RMTDP policy search; in particular, BDI team plans provide incomplete RMTDP policies, and the RMTDP policy search fills the gaps in such incomplete policies by searching for the best role allocation. Our second key contribution is a novel decomposition technique that further improves RMTDP policy search efficiency. Even though the search is limited to role allocations, there are still combinatorially many of them, and evaluating each in the RMTDP to identify the best is extremely expensive. Our decomposition technique exploits the structure in the BDI team plans to significantly prune the search space of role allocations. Our third key contribution is a significantly faster policy evaluation algorithm suited to our BDI-POMDP hybrid approach. Finally, we also present experimental results from two domains: mission rehearsal simulation and RoboCupRescue disaster rescue simulation.
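
To make the hybrid approach concrete, the sketch below shows, in illustrative Python, how a BDI team plan turns RMTDP policy search into a search over role allocations: the team plan fixes what each role does in each state, and only the assignment of agents to roles is left open. The RMTDP class, the bdi_policy signature, and the Monte Carlo evaluator are assumptions introduced for this sketch; the article's own algorithms evaluate allocations exactly and prune the combinatorial allocation space via a decomposition over the team-plan hierarchy.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Tuple


@dataclass
class RMTDP:
    """Simplified Role-based Markov Team Decision Problem: states, roles with
    role-specific actions, and transition/reward functions over joint actions
    (observations are omitted to keep the sketch short)."""
    states: List[str]
    roles: List[str]
    role_actions: Dict[str, List[str]]                 # actions available to each role
    transition: Callable[[str, Tuple[str, ...]], str]  # (state, joint action) -> next state
    reward: Callable[[str, Tuple[str, ...]], float]    # team reward for a joint action
    horizon: int


def evaluate_allocation(model: RMTDP,
                        allocation: Tuple[str, ...],
                        bdi_policy: Callable[[str, str], str],
                        episodes: int = 200) -> float:
    """Estimate the expected team reward of one role allocation by simulation.
    The BDI team plan (bdi_policy) fixes what each role does in each state, so
    the allocation is the only free part of the joint policy.  (The article's
    evaluator is exact and exploits team-plan structure; sampling is used here
    only for brevity.)"""
    total = 0.0
    for _ in range(episodes):
        state = model.states[0]            # assume a designated start state
        for _ in range(model.horizon):
            # Each agent acts according to the BDI plan for its assigned role.
            joint = tuple(bdi_policy(role, state) for role in allocation)
            total += model.reward(state, joint)
            state = model.transition(state, joint)
    return total / episodes


def best_role_allocation(model: RMTDP,
                         n_agents: int,
                         bdi_policy: Callable[[str, str], str]):
    """Search the space of role allocations for the highest-value one.  Plain
    enumeration is shown for clarity; the article prunes this space using the
    structure of the BDI team plan."""
    best, best_value = None, float("-inf")
    for allocation in product(model.roles, repeat=n_agents):
        value = evaluate_allocation(model, allocation, bdi_policy)
        if value > best_value:
            best, best_value = allocation, value
    return best, best_value
```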
