Adaptive, distributed control of constrained multi-agent systems

Product Distribution (PD) theory was recently developed as a framework for analyzing and optimizing distributed systems. In this paper we demonstrate its use for adaptive distributed control of Multi-Agent Systems (MASs), i.e., for distributed stochastic optimization using MASs. One common way to perform the optimization is to have each agent run a Reinforcement Learning (RL) algorithm. PD theory provides an alternative based on a variant of Newton's method operating on the agents' probability distributions. We compare this alternative to RL-based search in three sets of computer experiments. The PD-theory-based approach outperforms the RL-based scheme in all three domains.
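
To illustrate what a Newton-like update on the agents' probability distributions might look like, the following is a minimal sketch and not the paper's implementation. The abstract does not state the update rule; the form used here (each agent lowers a free energy E_q[G] - T*S(q) over a product distribution q) is an assumption drawn from the general PD literature. The toy cost `quadratic_cost`, the `targets`, the temperature, the step size, and all function names are hypothetical. Expectations are computed by exhaustive enumeration, which is feasible only for tiny problems; a practical system would presumably estimate them from samples.

```python
import itertools
import math


def expected_cost(q, cost):
    """E_q[G] under the product distribution q, by exhaustive enumeration."""
    total = 0.0
    for joint in itertools.product(*(range(len(qi)) for qi in q)):
        p = math.prod(qi[a] for qi, a in zip(q, joint))
        total += p * cost(joint)
    return total


def conditional_costs(q, cost, i):
    """E[G | x_i = a] for every action a of agent i; other agents follow q."""
    out = []
    for a in range(len(q[i])):
        pinned = [
            qj if j != i else [1.0 if b == a else 0.0 for b in range(len(qj))]
            for j, qj in enumerate(q)
        ]
        out.append(expected_cost(pinned, cost))
    return out


def newton_like_step(q, cost, temperature, step):
    """One simultaneous update of all agents (assumed update form, see lead-in)."""
    new_q = []
    for i, qi in enumerate(q):
        cond = conditional_costs(q, cost, i)
        mean = sum(p * c for p, c in zip(qi, cond))
        entropy = -sum(p * math.log(p) for p in qi)
        updated = [
            p - step * p * ((c - mean) / temperature + entropy + math.log(p))
            for p, c in zip(qi, cond)
        ]
        # Safety net: keep probabilities positive, then renormalize.
        updated = [max(p, 1e-12) for p in updated]
        z = sum(updated)
        new_q.append([p / z for p in updated])
    return new_q


# Hypothetical toy problem: three agents, three actions each, and a team
# cost that is smallest when every agent plays its own target action.
targets = [0, 2, 1]


def quadratic_cost(joint):
    return sum((a - t) ** 2 for a, t in zip(joint, targets))


uniform = [[1.0 / 3.0] * 3 for _ in targets]
q_final = uniform
for _ in range(200):
    q_final = newton_like_step(q_final, quadratic_cost, temperature=0.5, step=0.1)

print("expected cost, uniform start:", expected_cost(uniform, quadratic_cost))
print("expected cost, after descent:", expected_cost(q_final, quadratic_cost))
```

Each agent needs only its own conditional expected costs to update its distribution, which is what makes the scheme distributed. The temperature trades off exploration (high entropy) against concentrating probability on low-cost actions.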