Exploiting Locality of Interaction in Networked Distributed POMDPs

In many real-world multiagent applications such as distributed sensor nets, a network of agents is formed based on each agent's limited interactions with a small number of neighbors. While distributed POMDPs capture the real-world uncertainty in multiagent domains, they fail to exploit such locality of interaction. Distributed constraint optimization (DCOP) captures the locality of interaction but fails to capture planning under uncertainty. In previous work, we presented a model synthesized from distributed POMDPs and DCOPs, called Networked Distributed POMDPs (ND-POMDPs). We also presented LID-JESP (locally interacting distributed joint equilibrium-based search for policies), a distributed policy generation algorithm based on DBA (the distributed breakout algorithm). In this paper, we present a stochastic variant of LID-JESP based on DSA (the distributed stochastic algorithm), which allows neighboring agents to change their policies in the same cycle. Through detailed experiments, we show how this can result in speedups without a significant loss in solution quality. We also introduce a technique called hyper-link-based decomposition that allows us to exploit locality of interaction further, resulting in faster run times for both LID-JESP and its stochastic variant without any loss in solution quality.
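
The key algorithmic difference from DBA-based LID-JESP is that, under DSA, each agent commits to a locally improving policy only with some probability, so neighboring agents may update in the same cycle rather than waiting for a single winner per neighborhood. The following is a minimal sketch of such a DSA-style update loop, assuming hypothetical helpers (initial_policy, best_response, gain) and a flat policy dictionary; these are illustrative stand-ins, not the paper's actual data structures over the ND-POMDP interaction graph.

```python
import random

def dsa_policy_search(agents, neighbors, best_response, gain, p=0.5, cycles=100):
    """DSA-style stochastic policy search (sketch).

    agents        -- iterable of agent objects (hypothetical interface)
    neighbors     -- dict mapping each agent to its interaction-graph neighbors
    best_response -- callable(agent, neighbor_policies) -> candidate local policy
    gain          -- callable(agent, candidate, neighbor_policies) -> expected
                     improvement in local utility (positive means better)
    p             -- probability of committing an improving change in a cycle
    """
    # Start from each agent's initial local policy.
    policy = {a: a.initial_policy() for a in agents}

    for _ in range(cycles):
        proposals = {}
        for a in agents:
            # Evaluate a best response against neighbors' *current* policies.
            context = {n: policy[n] for n in neighbors[a]}
            candidate = best_response(a, context)
            # Stochastic commit: adopt an improving policy with probability p.
            if gain(a, candidate, context) > 0 and random.random() < p:
                proposals[a] = candidate
        # All committed changes take effect simultaneously, so neighbors
        # may change their policies in the same cycle (unlike DBA).
        policy.update(proposals)
    return policy
```

The commit probability p is the usual DSA lever: lowering it reduces the chance that adjacent agents make conflicting simultaneous changes, while raising it yields faster progress per cycle, which is the speed/quality trade-off the experiments examine.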
