Optimal Payoff Functions for Members of Collectives

We consider the problem of designing (perhaps massively distributed) collectives of computational processes to maximize a provided "world utility" function. We consider this problem when the behavior of each process in the collective can be cast as striving to maximize its own payoff utility function. For such cases the central design issue is how to initialize/update those payoff utility functions of the individual processes so as to induce behavior of the entire collective having good values of the world utility. Traditional "team game" approaches to this problem simply assign to each process the world utility as its payoff utility function. In previous work we used the "Collective Intelligence" (COIN) framework to derive a better choice of payoff utility functions, one that results in world utility performance up to orders of magnitude superior to that ensuing from the use of the team game utility. In this paper, we extend these results using a novel mathematical framework. Under that new framework we review the derivation of the general class of payoff utility functions that both (i) are easy for the individual processes to try to maximize, and (ii) have the property that if good values of them are achieved, then we are assured a high value of world utility. These are the "Aristocrat Utility" and a new variant of the "Wonderful Life Utility" that was introduced in the previous COIN work. We demonstrate experimentally that using these new utility functions can result in significantly improved performance over that of previously investigated COIN payoff utilities, over and above those previous utilities' superiority to the conventional team game utility. These results also illustrate the substantial superiority of these payoff functions to perhaps the most natural version of the economics technique of "endogenizing externalities."

Kagan Tumer | David H. Wolpert
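The abstract names three payoff choices (the team game utility, the Wonderful Life Utility, and the Aristocrat Utility) without stating formulas. The following is a minimal sketch, not the paper's implementation, using the forms in which these utilities are commonly stated in the COIN literature: the Wonderful Life Utility as the world utility minus the world utility with the agent's action clamped to a fixed value, and the Aristocrat Utility as the world utility minus its average over the agent's possible actions. The toy congestion-style world utility, the `capacity` parameter, the choice of "agent removed" as the clamp value, the uniform average over actions, and all function names are assumptions made for illustration. The "new variant" of the Wonderful Life Utility mentioned in the abstract is not reproduced here.

```python
import math
from collections import Counter


def world_utility(actions, capacity=2.0):
    """Toy world utility G (an assumption): each resource k with x_k users
    contributes x_k * exp(-x_k / capacity). An action of None means the
    agent is absent and contributes nothing."""
    counts = Counter(a for a in actions if a is not None)
    return sum(x * math.exp(-x / capacity) for x in counts.values())


def team_game_utility(actions, i, capacity=2.0):
    """Team game: every agent simply receives the world utility."""
    return world_utility(actions, capacity)


def wonderful_life_utility(actions, i, clamp=None, capacity=2.0):
    """G(z) minus G with agent i's action clamped to a fixed value.
    Clamping to None (agent removed) is an illustrative choice."""
    clamped = list(actions)
    clamped[i] = clamp
    return world_utility(actions, capacity) - world_utility(clamped, capacity)


def aristocrat_utility(actions, i, n_resources, capacity=2.0):
    """G(z) minus the average of G over agent i's possible actions.
    A uniform average is assumed here for simplicity."""
    total = 0.0
    for a in range(n_resources):
        alternative = list(actions)
        alternative[i] = a
        total += world_utility(alternative, capacity)
    return world_utility(actions, capacity) - total / n_resources


# Usage: five agents choosing among three resources (hypothetical data).
joint_action = [0, 0, 0, 1, 2]
for agent in range(len(joint_action)):
    print(
        agent,
        round(team_game_utility(joint_action, agent), 4),
        round(wonderful_life_utility(joint_action, agent), 4),
        round(aristocrat_utility(joint_action, agent, n_resources=3), 4),
    )
```

In both difference utilities the subtracted term does not depend on the agent's own action, so any change the agent makes moves its payoff and the world utility by the same amount. That is one way to read property (ii) in the abstract. The subtraction also removes much of the contribution of the other agents, which is the intuition usually given for property (i).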
