Paper information - Dec-POMDPs as Non-Observable MDPs

Dec-POMDPs as Non-Observable MDPs

A recent insight in the field of decentralized partially observable Markov decision processes (Dec-POMDPs) is that it is possible to convert a Dec-POMDP to a non-observable MDP, which is a special case of POMDP. This technical report provides an overview of this reduction and pointers to related literature.

Frans A. Oliehoek | Christopher Amato

[1] Shlomo Zilberstein, et al. Point-based backup for decentralized POMDPs: complexity and new algorithms, 2010, AAMAS.

[2] Hans S. Witsenhausen, et al. A standard form for sequential stochastic control, 1973, Mathematical systems theory.

[3] Frans A. Oliehoek, et al. Value-Based Planning for Teams of Agents in Stochastic Partially Observable Environments, 2010.

[4] Nikos A. Vlassis, et al. Perseus: Randomized Point-based Value Iteration for POMDPs, 2005, J. Artif. Intell. Res.

[5] Guy Shani, et al. A Survey of Point-Based POMDP Solvers, 2022.

[6] Frans A. Oliehoek, et al. Heuristic search for identical payoff Bayesian games, 2010, AAMAS.

[7] Olivier Buffet, et al. Optimally Solving Dec-POMDPs as Continuous-State MDPs, 2022, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence.

[8] Ashutosh Nayyar, et al. Optimal Control Strategies in Delayed Sharing Information Structures, 2010, IEEE Transactions on Automatic Control.

[9] Ashutosh Nayyar, et al. Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach, 2012, IEEE Transactions on Automatic Control.

[10] Zoran Zivkovic, et al. The planar two point algorithm, 2009.

[11] François Charpillet, et al. MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs, 2005, UAI.

[12] Arnoud Visser. A survey of the architecture of the communication library LCM for the monitoring and control of autonomous mobile robots, 2012.

[13] Jonathan P. How, et al. Decentralized control of partially observable Markov decision processes, 2015, 52nd IEEE Conference on Decision and Control.

[14] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.

[15] Charles L. Isbell, et al. Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs, 2013, NIPS.

[16] Ashutosh Nayyar, et al. The Common-Information Approach to Decentralized Stochastic Control, 2014.

[17] Shimon Whiteson, et al. Exploiting Structure in Cooperative Bayesian Games, 2012, UAI.

[18] Frans A. Oliehoek, et al. Sufficient Plan-Time Statistics for Decentralized POMDPs, 2013, IJCAI.

[19] Frans A. Oliehoek, et al. Decentralized POMDPs, 2012, Reinforcement Learning.

[20] Arnoud Visser, et al. UvA Rescue Technical Report: a description of the methods and algorithms implemented in the UvA Rescue code release, 2012.

[21] Aditya Mahajan, et al. Decentralized stochastic control, 2013, Annals of Operations Research.

[22] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.

[23] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.

[24] Frans A. Oliehoek, et al. Incremental clustering and expansion for faster optimal planning in decentralized POMDPs, 2013.