Paper information - Adaptive, distributed control of constrained multi-agent systems

Product Distribution (PD) theory was recently developed as a framework for analyzing and optimizing distributed systems. In this paper we demonstrate its use for adaptive distributed control of Multi-Agent Systems (MASs), i.e., for distributed stochastic optimization using MASs. One common way to perform the optimization is to have each agent run a Reinforcement Learning (RL) algorithm. PD theory provides an alternative based upon using a variant of Newton's method operating on the agents' probability distributions. We compare this alternative to RL-based search in three sets of computer experiments. The PD-theory-based approach outperforms the RL-based scheme in all three domains.
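The abstract does not give the update rule, so the following is only a minimal sketch of what "a variant of Newton's method operating on the agents' probability distributions" might look like. It assumes the "nearest Newton" step as it is commonly written in the PD literature, q_i(x_i) <- q_i(x_i) - alpha * q_i(x_i) * [(E[G|x_i] - E[G]) / T + S(q_i) + ln q_i(x_i)]. The cost table `G`, the temperature `T`, the step size `alpha`, and all function names are hypothetical, and the conditional expectations are computed exactly here rather than estimated from samples.

```python
import math

# Hypothetical two-agent team game: G[x0][x1] is the shared cost to minimize.
G = [[3.0, 2.0, 4.0],
     [2.0, 0.0, 3.0],
     [4.0, 3.0, 5.0]]


def conditional_costs(q_other, agent):
    """Return E[G | x_i] for each move of `agent`, averaged over the other agent."""
    n = len(G)
    if agent == 0:
        return [sum(q_other[y] * G[x][y] for y in range(n)) for x in range(n)]
    return [sum(q_other[x] * G[x][y] for x in range(n)) for y in range(n)]


def nearest_newton_step(q_i, cond, temperature, alpha):
    """One assumed nearest-Newton step on a single agent's distribution."""
    expected = sum(p * c for p, c in zip(q_i, cond))   # E[G] under current q
    entropy = -sum(p * math.log(p) for p in q_i)       # S(q_i)
    # The bracketed term averages to zero under q_i, so the step keeps sum(q_i) = 1.
    return [p - alpha * p * ((c - expected) / temperature + entropy + math.log(p))
            for p, c in zip(q_i, cond)]


def optimize(temperature=1.0, alpha=0.1, steps=300):
    """Update both agents simultaneously, starting from uniform distributions."""
    q = [[1.0 / 3] * 3, [1.0 / 3] * 3]
    for _ in range(steps):
        cond0 = conditional_costs(q[1], 0)
        cond1 = conditional_costs(q[0], 1)
        q = [nearest_newton_step(q[0], cond0, temperature, alpha),
             nearest_newton_step(q[1], cond1, temperature, alpha)]
    return q


q0, q1 = optimize()
print(q0, q1)
```

With these assumed settings each agent should end up placing most of its probability on the move that belongs to the lowest-cost joint action. The step size is kept small so that every probability stays positive; a larger `alpha` or a lower temperature would need an explicit safeguard.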

David H. Wolpert | Stefan R. Bieniawski
