Influence-Optimistic Local Values for Multiagent Planning

Over the last decade, methods for multiagent planning under uncertainty have become increasingly scalable. However, many of these methods either assume that the value function factorizes or cannot provide quality guarantees. We propose a novel family of influence-optimistic upper bounds on the optimal value for problems with hundreds of agents that do not exhibit value factorization.