Designing agent collectives for systems with Markovian dynamics

The "Collective Intelligence" (COIN) framework concerns the design of collectives of agents so that as those agents strive to maximize their individual utility functions, their interaction causes a provided "world" utility function concerning the entire collective to be also maximized. Here we show how to extend that framework to scenarios having Markovian dynamics when no re-evolution of the system from counter-factual initial conditions (an often expensive calculation) is permitted. Our approach transforms the(time-extended) argument of each agent's utility function before evaluating that function. This transformation has benefits in scenarios not involving Markovian dynamics, in particular scenarios where not all of the arguments of an agent's utility function are observable. We investigate this transformation in simulations involving both linear and quadratic (nonlinear) dynamics. In addition, we find that a certain subset of these transformations, which result in utilities that have low "opacity (analogous to having high signal to noise) but are not "factored" (analogous to not being incentive compatible), reliably improve performance over that arising with factored utilities. We also present a Taylor Series method for the fully general nonlinear case.
