Scalable Multiagent Planning Using Probabilistic Inference

Multiagent planning has seen much progress with the development of formal models such as decentralized POMDPs (Dec-POMDPs). However, the complexity of these models, NEXP-complete even for two agents, has limited scalability. We identify mild conditions that are sufficient to make multiagent planning amenable to an approximation that scales with the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the Expectation-Maximization (EM) framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve only a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability.
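The paper itself develops the multiagent construction; as a rough illustration of the underlying idea it builds on, the sketch below implements the single-agent planning-as-inference EM loop on a small finite-horizon MDP. Rewards are scaled into [0, 1] so that r(s, a) can be read as the probability of a binary "success" variable, making likelihood maximization equivalent to reward maximization. This is a minimal sketch under those assumptions, not the paper's algorithm; the function name em_policy_iteration and all variable names are illustrative.

```python
import numpy as np

def em_policy_iteration(P, r, p0, horizon, iters=50):
    """EM for a finite-horizon MDP, treating scaled reward as a likelihood.

    P:  (S, A, S) transition tensor, P[s, a, s'] = Pr(s' | s, a)
    r:  (S, A) rewards scaled into [0, 1]
    p0: (S,) initial state distribution
    Returns a stochastic policy pi of shape (S, A).
    """
    S, A = r.shape
    pi = np.full((S, A), 1.0 / A)  # start from the uniform policy

    for _ in range(iters):
        # E-step, forward pass: alpha[t, s] = Pr(state s at time t) under pi.
        alpha = np.zeros((horizon, S))
        alpha[0] = p0
        for t in range(1, horizon):
            joint = alpha[t - 1][:, None] * pi              # Pr(s, a) at t-1
            alpha[t] = np.einsum('sa,sap->p', joint, P)

        # E-step, backward pass: Q[t, s, a] = expected reward-to-go under pi.
        Q = np.zeros((horizon, S, A))
        Q[-1] = r
        for t in range(horizon - 2, -1, -1):
            V_next = (pi * Q[t + 1]).sum(axis=1)            # V_{t+1}(s)
            Q[t] = r + np.einsum('sap,p->sa', P, V_next)

        # M-step: reweight each action by its reward-weighted occupancy,
        # then renormalize. This update never decreases expected reward.
        w = (alpha[:, :, None] * Q).sum(axis=0)             # (S, A)
        pi = pi * (w + 1e-12)
        pi /= pi.sum(axis=1, keepdims=True)
    return pi
```

In the multiagent setting the abstract describes, the analogous E-step quantities factorize over the agents' interaction structure, so each agent's inference involves only a small subset of other agents; the global update rule then combines these local results while preserving the monotone improvement property.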
