Lagrangian Relaxation for Large-Scale Multi-agent Planning

Multi-agent planning is a well-studied problem with applications including disaster rescue, urban transportation, and logistics, both for autonomous agents and for decision support to humans. Due to computational constraints, existing research typically focuses on one of two scenarios: unstructured domains with many agents, where we are content with heuristic solutions, or domains with small numbers of agents or special structure, where we can provide provably near-optimal solutions. By contrast, in this paper we focus on providing provably near-optimal solutions for domains with large numbers of agents by exploiting a common domain-general property: if individual agents each have limited influence on the overall solution quality, then we can take advantage of randomization and the resulting statistical concentration to show that each agent can safely plan based only on the average behavior of the other agents. To that end, we make two key contributions: (a) an algorithm, based on Lagrangian relaxation and randomized rounding, for solving multi-agent planning problems represented as large mixed-integer programs, and (b) a proof of convergence of our algorithm to a near-optimal solution. We demonstrate the scalability of our approach on an illustrative large-scale theme park crowd management problem.
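
As a rough illustration of how Lagrangian relaxation and randomized rounding can fit together, the following toy sketch has each agent pick one of several capacity-limited actions (for example, attractions in a park). It is not the algorithm of the paper: the function name `plan`, the reward table, the step-size schedule, and the rounding rule are all assumptions made for this example. Capacity constraints are replaced by prices, each agent best-responds to the prices independently, the prices follow a projected subgradient update, and each agent finally samples one action from the empirical frequency of its own choices.

```python
import random


def plan(rewards, capacity, iters=200, step0=1.0, seed=0):
    """Toy Lagrangian relaxation with randomized rounding (illustrative only).

    rewards[i][a] is the hypothetical reward agent i gets from action a;
    capacity[a] is the number of agents action a can serve.
    """
    rng = random.Random(seed)
    n, k = len(rewards), len(capacity)
    prices = [0.0] * k                      # one Lagrange multiplier per action
    counts = [[0] * k for _ in range(n)]    # how often each agent chose each action

    for t in range(1, iters + 1):
        load = [0] * k
        for i in range(n):
            # Each agent plans alone: maximize reward minus current price.
            best = max(range(k), key=lambda a: rewards[i][a] - prices[a])
            counts[i][best] += 1
            load[best] += 1
        # Projected subgradient step: raise prices of overloaded actions,
        # lower prices of underused ones, and keep prices non-negative.
        step = step0 / t ** 0.5
        prices = [
            max(0.0, p + step * (load[a] - capacity[a]) / n)
            for a, p in enumerate(prices)
        ]

    # Randomized rounding: each agent samples an action from the empirical
    # distribution of its own best responses across iterations.
    choice = [rng.choices(range(k), weights=counts[i])[0] for i in range(n)]
    return prices, counts, choice


if __name__ == "__main__":
    demo_rewards = [[1.0, 0.5], [1.0, 0.4], [1.0, 0.9], [1.0, 0.8]]
    print(plan(demo_rewards, capacity=[2, 2]))
```

Because each agent samples independently, the rounded loads in this sketch only approximate the capacities; the intuition from the abstract is that with many agents of limited individual influence, such deviations concentrate around their averages.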
