Asynchronous Execution in Multiagent POMDPs: Reasoning over Partially-Observable Events

This paper proposes a novel modeling approach to problems of multiagent decision-making under partial observability, building on the framework of Multiagent Partially Observable Markov Decision Processes (MPOMDPs). Unfortunately, the size of MPOMDP models (and of their solutions) grows exponentially with the number of agents, and agents are required to act in synchrony. In the present work, we show how these problems can be mitigated through an event-driven, asynchronous formulation of the MPOMDP dynamics. We introduce the necessary extensions to the dynamics and solution algorithms of standard MPOMDPs. In particular, we prove that the optimal value function in our Event-Driven Multiagent POMDP framework is piecewise linear and convex, allowing us to extend a standard point-based solver to the event-driven setting. Finally, we present simulation results showing the computational savings of our modeling approach.
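The piecewise-linear-and-convex (PWLC) property is what makes point-based solving possible: the value function can be represented by a finite set of alpha-vectors, and each backup at a sampled belief stays within that representation. The event-driven extension itself is beyond the scope of a sketch, but the standard point-based backup that such solvers build on can be illustrated as follows. This is a minimal, illustrative sketch of a generic PBVI-style backup for a flat POMDP, not the paper's algorithm; the function name, array layouts, and the toy model are all assumptions made for the example.

```python
import numpy as np

def point_based_backup(b, Gamma, T, O, R, gamma):
    """One point-based Bellman backup at belief b (illustrative sketch).

    b     : (S,) belief vector
    Gamma : list of (S,) alpha-vectors representing the current PWLC value fn
    T     : (A, S, S) transition model, T[a, s, s']
    O     : (A, S, Z) observation model, O[a, s', z]
    R     : (S, A) immediate rewards
    gamma : discount factor
    Returns the alpha-vector that maximizes b . alpha after one backup.
    """
    A, S, _ = T.shape
    Z = O.shape[2]
    best_val, best_alpha = -np.inf, None
    for a in range(A):
        alpha_a = R[:, a].astype(float).copy()
        for z in range(Z):
            # g_i[s] = sum_{s'} T[a,s,s'] * O[a,s',z] * alpha_i[s']
            g = [T[a] @ (O[a, :, z] * alpha) for alpha in Gamma]
            # keep the candidate that is best at this particular belief
            alpha_a += gamma * max(g, key=lambda v: b @ v)
        val = b @ alpha_a
        if val > best_val:
            best_val, best_alpha = val, alpha_a
    return best_alpha

# Toy 2-state, 2-action, 2-observation problem (values are arbitrary).
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.array([0.6, 0.4])
Gamma = [np.zeros(2)]  # trivial initial value function
alpha = point_based_backup(b, Gamma, T, O, R, gamma=0.95)
```

Repeating this backup over a fixed set of sampled beliefs, and collecting the resulting alpha-vectors into a new set, is the core loop shared by solvers such as PBVI, Perseus, and SARSOP; the paper's contribution is to preserve the PWLC structure under asynchronous, event-driven dynamics so that this machinery still applies.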
