Heuristic Planning for Decentralized MDPs with Sparse Interactions

In this work, we explore how local interactions can simplify decision-making in multiagent systems, particularly in multirobot problems. We review a recent decision-theoretic model for multiagent systems, the decentralized sparse-interaction Markov decision process (Dec-SIMDP), which explicitly distinguishes the situations in which the agents in the team must coordinate from those in which they can act independently. We situate this class of problems among related multiagent models, such as multiagent MDPs (MMDPs) and transition-independent Dec-MDPs. We then contribute a new general approach that leverages the particular structure of Dec-SIMDPs to plan efficiently in this class of problems, and propose two algorithms built on this underlying approach. We pinpoint the main properties of our approach through illustrative examples in multirobot navigation domains with partial observability, and provide empirical comparisons between our algorithms and other existing algorithms for this class of problems. We show that our approach allows the robots to look ahead for possible interactions and to plan for them in advance, thereby overcoming some of the limitations of previous methods.
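To make the sparse-interaction idea concrete, the following is a minimal, hypothetical sketch of execution in such a model: each agent follows its individual policy except in designated interaction areas, where a precomputed joint policy takes over. This is an illustration of the general concept, not the paper's algorithm; all names (`act`, `interaction_area`, `local_policy`, `joint_policy`) are assumptions introduced for the example.

```python
# Illustrative sketch of execution under a sparse-interaction model.
# Agents act on their individual MDPs except in "interaction areas",
# where coordination is required and a joint policy is consulted.

from typing import Callable, FrozenSet, Tuple

State = Tuple[int, int]  # e.g., a grid cell in a multirobot navigation domain
Action = str

def act(
    agent_state: State,
    other_state: State,
    interaction_area: FrozenSet[State],
    local_policy: Callable[[State], Action],
    joint_policy: Callable[[State, State], Action],
) -> Action:
    """Select this agent's action for the current step."""
    # Coordinate only when both agents are inside an interaction area;
    # everywhere else the agent can safely act on its individual model.
    if agent_state in interaction_area and other_state in interaction_area:
        return joint_policy(agent_state, other_state)
    return local_policy(agent_state)
```

Under this view, planning effort concentrates on the (small) interaction areas, while the remainder of the joint state space factors into independent single-agent problems.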
