General principles of learning-based multi-agent systems

We consider the problem of designing large decentralized multiagent systems (MASs) in an automated fashion, with little or no hand-tuning. In our approach, each agent runs a reinforcement learning algorithm. This converts the design problem into one of automatically setting and updating the reward function of each agent so that the global goal is achieved. In particular, we do not want the agents to “work at cross-purposes” as far as the global goal is concerned. We use the term artificial COllective INtelligence (COIN) to refer to systems that embody solutions to this problem. In this paper we summarize a mathematical framework for COINs. We then investigate the real-world applicability of the core concepts of that framework via two computer experiments. First, we show that our COINs perform near optimally in a difficult variant of Arthur’s bar problem [1], and in particular avoid the tragedy of the commons for that problem. Second, we illustrate optimal performance for our COINs in the leader-follower problem.
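To make the role of the reward functions concrete, the following is a minimal sketch of a bar-problem-like setting, not the experiments or the framework reported in the paper. Everything in it is an assumption made for illustration: the world utility (a sum over nights of x * exp(-x / capacity)), the two candidate per-agent rewards (a "selfish" reward and a difference-style reward that compares the world utility with and without the agent), the simple epsilon-greedy value learner, and all names and parameter values. The abstract does not state which reward functions or learning algorithms the authors actually use.

```python
import math
import random


def world_utility(attendance, capacity):
    """Assumed global goal: sum over nights of x * exp(-x / capacity)."""
    return sum(x * math.exp(-x / capacity) for x in attendance)


def selfish_reward(attendance, night, capacity):
    """Assumed per-agent enjoyment on the chosen night; ignores crowding costs imposed on others."""
    return math.exp(-attendance[night] / capacity)


def difference_reward(attendance, night, capacity):
    """Assumed difference-style reward: world utility with the agent minus world utility without it."""
    without_agent = list(attendance)
    without_agent[night] -= 1  # the agent attended this night, so the count is at least 1
    return world_utility(attendance, capacity) - world_utility(without_agent, capacity)


def run(reward_fn, n_agents=60, n_nights=7, capacity=4,
        steps=3000, eps=0.05, lr=0.1, seed=0):
    """Each agent is a stateless epsilon-greedy learner with one value estimate per night.

    Returns the world utility averaged over the last (up to) 200 steps.
    """
    rng = random.Random(seed)
    values = [[0.0] * n_nights for _ in range(n_agents)]
    history = []
    for _ in range(steps):
        # Every agent independently picks a night to attend.
        choices = []
        for v in values:
            if rng.random() < eps:
                choices.append(rng.randrange(n_nights))
            else:
                best = max(v)
                choices.append(rng.choice([k for k, q in enumerate(v) if q == best]))
        attendance = [0] * n_nights
        for k in choices:
            attendance[k] += 1
        # Every agent updates only the estimate for the night it chose.
        for v, k in zip(values, choices):
            v[k] += lr * (reward_fn(attendance, k, capacity) - v[k])
        history.append(world_utility(attendance, capacity))
    tail = history[-200:]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    print("selfish reward:   ", run(selfish_reward))
    print("difference reward:", run(difference_reward))
```

Running the script prints the average world utility reached under each reward, which gives a rough feel for how the choice of per-agent reward alone can change the collective outcome while the learners stay the same. No particular numerical result is claimed here.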

[1]  J. Davenport (editor), 1960.

[2]  G. Hardin, et al. The Tragedy of the Commons, 1968, Green Planet Blues.

[3]  Drew Fudenberg, et al. Game theory (3. pr.), 1991.

[4]  W. Arthur. Complexity in economic theory: inductive reasoning and bounded rationality, 1994.

[5]  Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.

[6]  A Roadmap of Agent Research and Development, 1995.

[7]  Jeffrey M. Bradshaw, et al. Software agents, 1997.

[8]  Craig Boutilier, et al. Economic Principles of Multi-Agent Systems, 1997, Artif. Intell.

[9]  M. Marsili, et al. A Prototype Model of Stock Exchange, 1997, cond-mat/9709118.

[10]  Yi-Cheng Zhang, et al. Emergence of cooperation and organization in an evolutionary game, 1997.

[11]  Y. Shoham, et al. Editorial: economic principles of multi-agent systems, 1997.

[12]  Onn Shehory, et al. Anytime Coalition Structure Generation with Worst Case Guarantees, 1998, AAAI/IAAI.

[13]  Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[14]  Kagan Tumer, et al. Distributed Control with Collective Intelligence, 1998.

[15]  Michael P. Wellman, et al. Online learning about other agents in a dynamic multiagent system, 1998, AGENTS '98.

[16]  E. B. Baum, et al. Manifesto for an evolutionary economics of intelligence, 1998.

[17]  P. M. Hui, et al. Volatility and agent adaptability in a self-organizing market, 1998, cond-mat/9802177.

[18]  Michael P. Wellman, et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, 1998, ICML.

[19]  Kagan Tumer, et al. Using Collective Intelligence to Route Internet Traffic, 1998, NIPS.

[20]  A. Engel, et al. Matrix Games, Mixed Strategies, and Statistical Mechanics, 1998, cond-mat/9809265.

[21]  D. Fudenberg, et al. The Theory of Learning in Games, 1998.

[22]  Kagan Tumer, et al. Collective Intelligence for Control of Distributed Dynamical Systems, 1999, ArXiv.

[23]  Gerhard Weiss, et al. Multiagent Systems, 1999.