Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs

In many real-world multiagent applications such as distributed sensor nets, a network of agents is formed based on each agent's limited interactions with a small number of neighbors. While distributed POMDPs capture the real-world uncertainty in multiagent domains, they fail to exploit such locality of interaction. Distributed constraint optimization (DCOP) captures the locality of interaction but fails to capture planning under uncertainty. This paper present a new model synthesized from distributed POMDPs and DCOPs, called Networked Distributed POMDPs (ND-POMDPs). Exploiting network structure enables us to present two novel algorithms for ND-POMDPs: a distributed policy generation algorithm that performs local search and a systematic policy search that is guaranteed to reach the global optimal.

[1]  Rina Dechter,et al.  Constraint Processing , 1995, Lecture Notes in Computer Science.

[2]  M. Yokoo,et al.  Distributed Breakout Algorithm for Solving Distributed Constraint Satisfaction Problems , 1996 .

[3]  Makoto Yokoo,et al.  Distributed constraint satisfaction algorithm for complex local problems , 1998, Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160).

[4]  Neil Immerman,et al.  The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.

[5]  Kee-Eung Kim,et al.  Learning to Cooperate via Policy Search , 2000, UAI.

[6]  Shobha Venkataraman,et al.  Context-specific multiagent coordination and planning with factored MDPs , 2002, AAAI/IAAI.

[7]  Milind Tambe,et al.  Distributed Sensor Networks: A Multiagent Perspective , 2003 .

[8]  Makoto Yokoo,et al.  An asynchronous complete method for distributed constraint optimization , 2003, AAMAS '03.

[9]  Makoto Yokoo,et al.  Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings , 2003, IJCAI.

[10]  Edmund H. Durfee,et al.  Graphical models in local, asymmetric multi-agent Markov decision processes , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[11]  Jeff G. Schneider,et al.  Approximate solutions for partially observable stochastic games with common payoffs , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[12]  Shlomo Zilberstein,et al.  Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.

[13]  Claudia V. Goldman,et al.  Solving Transition Independent Decentralized Markov Decision Processes , 2004, J. Artif. Intell. Res..

[14]  Claudia V. Goldman,et al.  Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis , 2004, J. Artif. Intell. Res..