Recently, a theory for stochastic optimal control in non-linear dynamical systems in continuous space-time has been developed (Kappen, 2005). We apply this theory to collaborative multi-agent systems. The agents evolve according to a given non-linear dynamics with additive Wiener noise. Each agent can control its own dynamics. The goal is to minimize the accumulated joint cost, which consists of a state-dependent term and a term that is quadratic in the control. We focus on systems of non-interacting agents that have to distribute themselves optimally over a number of targets, given a set of end-costs for the different possible agent-target combinations. We show that the optimal control is a combinatorial sum of independent single-agent single-target optimal controls, weighted by a factor proportional to the end-costs of the different combinations. Thus, multi-agent control is related to a standard graphical model inference problem. The additional computational cost compared to single-agent control is exponential in the tree-width of the graph specifying the combinatorial sum times the number of targets. We illustrate the result by simulations of systems with up to 42 agents.
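The weighted combinatorial sum described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each agent-target pair already comes with a single-agent control `u[a][s]` and a positive single-agent weight `Z[a][s]`, and that a joint assignment `s` is weighted by `exp(-end_cost(s) / lam)` times the product of the `Z` terms. This weighting form, and the names `combine_controls`, `u`, `Z`, `end_cost` and `lam`, are assumptions made for illustration. The sketch enumerates all assignments by brute force, which is exponential in the number of agents; it does not use the graphical-model inference that the abstract refers to.

```python
import itertools
import math


def combine_controls(u, Z, end_cost, lam=1.0):
    """Combine single-agent single-target controls into per-agent controls.

    u[a][s]     : control of agent a if it were steering to target s alone (assumed given)
    Z[a][s]     : positive single-agent weight for agent a and target s (assumed given)
    end_cost(s) : joint end-cost of the assignment tuple s = (s_0, ..., s_{n-1})
    lam         : positive scale factor in the exponential weighting (assumption)
    """
    n_agents = len(u)
    n_targets = len(u[0])
    marginals = [[0.0] * n_targets for _ in range(n_agents)]
    total = 0.0

    # Brute-force sum over all n_targets ** n_agents joint assignments.
    for s in itertools.product(range(n_targets), repeat=n_agents):
        w = math.exp(-end_cost(s) / lam)
        for a, sa in enumerate(s):
            w *= Z[a][sa]
        total += w
        # Accumulate the unnormalized marginal weight of each agent-target pair.
        for a, sa in enumerate(s):
            marginals[a][sa] += w

    marginals = [[m / total for m in row] for row in marginals]

    # Each agent's control is the marginal-weighted sum of its single-target controls.
    controls = [
        sum(p * ua for p, ua in zip(marginals[a], u[a]))
        for a in range(n_agents)
    ]
    return controls, marginals


if __name__ == "__main__":
    # Two agents, two targets; sharing a target is heavily penalized.
    u = [[1.0, -1.0], [1.0, -1.0]]
    Z = [[3.0, 1.0], [1.0, 1.0]]
    penalty = lambda s: 0.0 if s[0] != s[1] else 50.0
    controls, marginals = combine_controls(u, Z, penalty)
    print(controls, marginals)
```

In this toy setting, agent 0 is weighted toward target 0, and the penalty on shared targets shifts agent 1 toward target 1 even though its own weights are symmetric.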
[1] David J. Spiegelhalter, et al. Local computations with probabilities on graphical structures and their application to expert systems, 1990.
[2] Robert F. Stengel, et al. Optimal Control and Estimation, 1994.
[3] Craig Boutilier, et al. Planning, Learning and Coordination in Multiagent Decision Processes, 1996, TARK.
[4] Carlos Guestrin, et al. Multiagent Planning with Factored MDPs, 2001, NIPS.
[5] Shobha Venkataraman, et al. Context-specific multiagent coordination and planning with factored MDPs, 2002, AAAI/IAAI.
[6] H. Kappen. Linear theory for control of nonlinear stochastic systems, 2004, Physical Review Letters.
[7] H. Kappen. Path integrals and symmetry breaking for optimal control theory, 2005, physics/0505066.