Conjectural Equilibrium in Multiagent Learning

Learning in a multiagent environment is complicated by the fact that, as other agents learn, the environment effectively changes. Moreover, other agents' actions are often not directly observable, and the actions taken by the learning agent can strongly bias which range of behaviors is encountered. We define the concept of a conjectural equilibrium, in which all agents' expectations are realized and each agent responds optimally to its expectations. We present a generic multiagent exchange situation in which competitive behavior constitutes a conjectural equilibrium. We then introduce an agent that follows a more sophisticated strategic learning approach, building a model of the response of other agents. We find that the system reliably converges to a conjectural equilibrium, but that the outcome reached is highly sensitive to the agent's initial belief. In essence, the strategic learner's actions tend to fulfill its expectations. Depending on the starting point, the agent may be better or worse off than if it had not attempted to learn a model of the other agents at all.
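The self-fulfilling character of a conjectural equilibrium can be illustrated with a toy sketch. It is not the exchange economy studied in the paper: the linear price response `A - B*q`, the cost `c`, the fixed conjectured slope `beta`, and the intercept update rule are all assumptions chosen for illustration. A single learner conjectures that price responds to its quantity as `alpha + beta*q`, best-responds to that conjecture, and then shifts `alpha` so that the conjecture matches the price it observed.

```python
def conjectural_fixed_point(beta, A=10.0, B=1.0, c=2.0, steps=200):
    """Toy model (all parameters hypothetical).

    The agent believes price = alpha + beta*q, with beta < 0 held fixed as
    its initial belief about the slope. The environment's actual response,
    price = A - B*q, is never observed directly; only realized prices are.
    Returns (q, profit, residual), where residual measures how far the final
    action is from the best response to the final conjecture.
    """
    alpha = A  # arbitrary starting intercept
    q = 0.0
    for _ in range(steps):
        # Best response to the conjecture: maximize q*(alpha + beta*q) - c*q.
        q = (alpha - c) / (-2.0 * beta)
        # The environment reveals only the realized price at this q.
        observed = A - B * q
        # Shift the intercept so the conjecture is correct at the observed point.
        alpha = observed - beta * q
    best_response = (alpha - c) / (-2.0 * beta)
    profit = (observed - c) * q
    return q, profit, abs(best_response - q)


for beta in (-0.5, -1.0, -3.0):
    q, profit, residual = conjectural_fixed_point(beta)
    print(f"beta={beta:5.1f}  q={q:.4f}  profit={profit:.4f}  residual={residual:.2e}")
```

Under these assumptions, every starting slope settles at a point where the predicted price equals the observed price and the action is optimal given the conjecture, yet the resting point and the profit differ with `beta`. Only the belief whose slope matches the environment (`beta = -1.0` here) reaches the highest profit; the other beliefs are confirmed by experience without ever being corrected.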
