The design of collectives of agents to control non-Markovian systems

The "Collective Intelligence" (COIN) framework concerns the design of collectives of reinforcement-learning agents such that their interaction causes a provided "world" utility function concerning the entire collective to be maximized. Previously, we applied that framework to scenarios involving Markovian dynamics where no re-evolution of the system from counter-factual initial conditions (an often expensive calculation) is permitted. This approach sets the individual utility function of each agent to be both aligned with the world utility, and at the same time, easy for the associated agents to optimize. Here we extend that approach to systems involving non-Markovian dynamics. In computer simulations, we compare our techniques with each other and with conventional "team games" We show whereas in team games performance often degrades badly with time, it steadily improves when our techniques are used. We also investigate situations where the system's dimensionality is effectively reduced. We show that this leads to difficulties in the agents' ability to learn. The implication is that "learning" is a property only of high-enough dimensional systems.
