Solving Continuous-Time Transition-Independent DEC-MDP with Temporal Constraints

Despite the impact of DEC-MDPs over the past decade, scaling to large problem domains has been difficult to achieve. The scale-up problem is exacerbated in DEC-MDPs with continuous states, which are critical in domains involving time; the latest algorithm (M-DPFP) does not scale up beyond two agents and a handful of unordered tasks per agent. This paper is focused on meeting this challenge in continuous resource DEC-MDPs with two predominant contributions. First, it introduces a novel continuous time model for multi-agent planning problems that exploits transition independence in domains with graphical agent dependencies and temporal constraints. More importantly, it presents a new, iterative, locally optimal algorithm called SPAC that is a combination of the following key ideas: (1) defining a novel augmented CT-MDP such that solving this single-agent continuous time MDP provably provides an automatic best response to neighboring agents' policies; (2) fast convolution to efficiently generate such augmented MDPs; (3) a new enhanced lazy approximation algorithm to solve these augmented MDPs; (4) intelligent seeding of initial policies in the iterative process; (5) exploiting the graph structure of reward dependencies to leverage local interactions for scalability. Our experiments show SPAC not only finds solutions substantially faster than M-DPFP with comparable quality, but also scales well to large teams of agents.
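The outer structure described above — each agent iteratively solving an augmented single-agent MDP as a best response to its neighbors' fixed policies, starting from seeded policies and iterating until no agent improves — can be sketched as follows. This is a minimal illustration only, assuming hypothetical callbacks `solve_augmented_mdp`, `evaluate`, and `seed_policy` that stand in for the paper's augmented CT-MDP solver, policy evaluation, and seeding steps; none of these names come from the paper itself.

```python
def spac(agents, neighbors, solve_augmented_mdp, evaluate, seed_policy,
         max_iters=100):
    """Iterative best-response loop for transition-independent agents
    (locally optimal joint policy, in the spirit of SPAC's outer loop)."""
    # (4) intelligent seeding of initial policies
    policies = {a: seed_policy(a) for a in agents}
    values = {a: evaluate(a, policies) for a in agents}
    for _ in range(max_iters):
        improved = False
        for a in agents:
            # (1) build/solve the augmented MDP for agent a given only its
            # neighbors' policies -- (5) the reward-dependency graph means
            # non-neighbors cannot affect a's best response
            ctx = {n: policies[n] for n in neighbors[a]}
            candidate = solve_augmented_mdp(a, ctx)
            trial = dict(policies, **{a: candidate})
            v = evaluate(a, trial)
            if v > values[a] + 1e-9:  # strict improvement check
                policies[a], values[a] = candidate, v
                improved = True
        if not improved:  # no agent can improve: local optimum reached
            break
    return policies
```

With transition independence, each best-response computation touches only a single agent's (continuous-time) MDP, which is what makes the per-iteration cost depend on neighborhood size rather than team size.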
