Compact Mathematical Programs For DEC-MDPs With Structured Agent Interactions

To deal with the prohibitive complexity of calculating policies in Decentralized MDPs, researchers have proposed models that exploit structured agent interactions. Settings where most agent actions are independent, except for a few actions that affect the transitions and/or rewards of other agents, can be modeled using Event-Driven Interactions with Complex Rewards (EDI-CR). Finding the optimal joint policy can be formulated as an optimization problem. However, existing formulations are too verbose and/or lack optimality guarantees. We propose a compact Mixed Integer Linear Program formulation of EDI-CR instances. The key insight is that most action sequences of a group of agents have the same effect on a given agent. This allows us to treat these sequences as equivalent and use fewer variables. Experiments show that our formulation is more compact than existing formulations and leads to shorter solution times and better solutions.
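
The key insight can be illustrated with a toy sketch. This is not the paper's formulation: the action names, the horizon, and the `effect_on_other` function are all hypothetical, chosen only to show how many distinct action sequences can collapse into a few groups when only their effect on another agent matters. Here a single action `"x"` is assumed to be the only one that influences the other agent, and the effect is assumed to depend only on the first step at which `"x"` is taken.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy setting: "x" is the only action that affects the other agent.
ACTIONS = ("a", "b", "x")
HORIZON = 3


def effect_on_other(seq):
    """Assumed effect: the first step at which "x" occurs, or None if it never does."""
    return seq.index("x") if "x" in seq else None


def group_by_effect(sequences):
    """Group action sequences that have the same effect on the other agent."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[effect_on_other(seq)].append(seq)
    return dict(groups)


# Enumerate every action sequence of length HORIZON, then group them.
sequences = list(product(ACTIONS, repeat=HORIZON))
groups = group_by_effect(sequences)

print(f"{len(sequences)} sequences collapse into {len(groups)} effect groups")
```

Under these assumptions, a formulation that needs one variable per group rather than one per sequence would be correspondingly smaller; how the grouping is actually defined and used in the MILP is specified in the paper itself, not in this sketch.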
