Factored Upper Bounds for Multiagent Planning Problems under Uncertainty with Non-Factored Value Functions

Multiagent planning under uncertainty now scales to tens or even hundreds of agents. However, current methods are either restricted to problems with factored value functions or provide solutions without any quality guarantees. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions; such bounds would also enable meaningful benchmarking of methods in the latter category. To mitigate this problem, this paper introduces a family of influence-optimistic upper bounds for factored Dec-POMDPs that do not admit factored value functions. We demonstrate how these bounds yield firm quality guarantees for problems with hundreds of agents.
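
As a schematic sketch of the underlying idea (the notation here is assumed for illustration, not taken verbatim from the paper): an influence-optimistic bound replaces the true, non-factored joint value with a sum of local values, each evaluated under the most favorable influence that the rest of the system could exert on it,

\[
  V(\pi) \;\le\; \sum_{e} \max_{I_e} \, V_e(\pi_e, I_e),
\]

where $V_e$ denotes the local value of component $e$ under its local policy $\pi_e$ and incoming influence $I_e$. Maximizing over the influences decouples the components, so the resulting bound is optimistic and therefore admissible for use in heuristic search.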
