Optimal Wonderful Life Utility Functions in Multi-Agent Systems

The mathematics of Collective Intelligence (COINs) concerns the design of multi-agent systems that optimize an overall global utility function despite lacking centralized communication and control. Typically in a COIN each agent runs a distinct Reinforcement Learning (RL) algorithm, so much of the design problem reduces to how best to initialize and update each agent's private utility function so as to maximize the ensuing value of the global utility. Traditional team-game solutions to this problem assign each agent the global utility itself as its private utility function. In previous work we used the COIN framework to derive the alternative Wonderful Life Utility (WLU), and established experimentally that agents using it achieve global utility up to orders of magnitude better than agents using the team-game utility. The WLU has a free parameter, the clamping parameter, which we simply set to zero in that previous work. Here we derive the optimal value of the clamping parameter, and demonstrate experimentally that using this optimal value can significantly improve performance over clamping to zero, over and above the improvement beyond traditional approaches.
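The WLU construction described above can be sketched concretely. The world utility `G` and the toy congestion-style problem below are illustrative assumptions, not taken from the paper; only the form of the WLU itself (the global utility minus the global utility with one agent's action clamped to a fixed value) follows the abstract.

```python
# Hedged sketch of the Wonderful Life Utility (WLU).
# Assumption: a toy congestion problem where each agent picks one resource
# and the world utility rewards balanced loads. The paper's actual domains
# and derivations are not reproduced here.

from collections import Counter

def world_utility(actions):
    # Toy world utility G: negative sum of squared resource loads,
    # so evenly spread agents score higher (an illustrative choice).
    loads = Counter(actions)
    return -sum(n * n for n in loads.values())

def wonderful_life_utility(actions, agent, clamp):
    # WLU_i = G(z) - G(z with agent i's action clamped to a fixed value).
    # Clamping to 0 is the special case used in the authors' earlier work;
    # this paper instead derives the optimal clamping value.
    clamped = list(actions)
    clamped[agent] = clamp
    return world_utility(actions) - world_utility(clamped)

actions = [0, 1, 1, 2]  # each entry: the resource chosen by one agent
print(world_utility(actions))                                 # -6
print(wonderful_life_utility(actions, agent=3, clamp=0))      # 2
```

Because the clamped term is independent of agent `i`'s own action, each agent maximizing its WLU is "aligned" with the global utility while receiving a much less noisy learning signal than the team-game assignment of `G` itself.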
