Dec-POMDPs as Non-Observable MDPs

A recent insight in the field of decentralized partially observable Markov decision processes (Dec-POMDPs) is that a Dec-POMDP can be converted to a non-observable MDP (NOMDP), which is a special case of a POMDP. This technical report provides an overview of this reduction and pointers to related literature.
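To make the construction concrete, the following is a minimal sketch of the reduction, using illustrative notation (state s, joint action a, transition function T, observation function O, reward R) that may differ from the report's. The augmented NOMDP state pairs the environment state with the joint observation history, its "actions" are joint decision rules that map each agent's individual observation history to an individual action, and it emits no informative observations:

\[
\begin{aligned}
\text{state: } & \hat{s}_t = (s_t, \vec{o}_t), \qquad
\text{action: } \delta_t = \langle \delta_{1,t}, \dots, \delta_{n,t}\rangle \ \text{with}\ \delta_{i,t} : \vec{o}_{i,t} \mapsto a_{i,t},\\
\text{transition: } & \hat{T}\bigl((s_{t+1}, (\vec{o}_t, o_{t+1})) \,\big|\, (s_t, \vec{o}_t), \delta_t\bigr)
   = O\bigl(o_{t+1} \,\big|\, \delta_t(\vec{o}_t), s_{t+1}\bigr)\, T\bigl(s_{t+1} \,\big|\, s_t, \delta_t(\vec{o}_t)\bigr),\\
\text{reward: } & \hat{R}\bigl((s_t, \vec{o}_t), \delta_t\bigr) = R\bigl(s_t, \delta_t(\vec{o}_t)\bigr).
\end{aligned}
\]

Because the resulting model is non-observable, its belief, i.e. the plan-time distribution over (state, joint observation history) pairs, is updated deterministically by each joint decision rule, so in principle standard POMDP solution machinery can be applied to the Dec-POMDP, at the price of an action space of joint decision rules that grows doubly exponentially with the horizon.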
