Factored Upper Bounds for Multiagent Planning Problems under Uncertainty with Non-Factored Value Functions

Multiagent planning under uncertainty now scales to tens or even hundreds of agents. However, current methods are either restricted to problems with factored value functions or provide solutions without any quality guarantees. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions; such bounds would also enable meaningful benchmarking of methods in the latter category. To mitigate this problem, this paper introduces a family of influence-optimistic upper bounds for factored Dec-POMDPs that do not admit factored value functions. We demonstrate how these bounds yield firm quality guarantees for problems with hundreds of agents.
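
As a schematic sketch of the underlying idea (the notation here is assumed for illustration, not taken verbatim from the paper): an influence-optimistic bound replaces the true, non-factored joint value with a sum of local values, each evaluated under the most favorable influence that the rest of the system could exert on it,

\[
  V(\pi) \;\le\; \sum_{e} \max_{I_e} \, V_e(\pi_e, I_e),
\]

where $V_e$ denotes the local value of component $e$ under its local policy $\pi_e$ and incoming influence $I_e$. Maximizing over the influences decouples the components, so the resulting bound is optimistic and therefore admissible for use in heuristic search.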
