Reinforcement learning in population games

We study reinforcement learning in population games. Agents revise their mixed strategies using the Cross rule of reinforcement learning. The population state (the probability distribution over the set of mixed strategies) evolves according to the replicator continuity equation, which in its simplest form is a partial differential equation. The replicator dynamic arises as the special case in which the initial population state is homogeneous, i.e. when all agents use the same mixed strategy. We apply the continuity dynamic to various classes of symmetric games. Using 3×3 coordination games, we show that equilibrium selection depends on the variance of the initial strategy distribution, i.e. on the initial heterogeneity of the population. We give an example of a 2×2 game in which heterogeneity persists even as the mean population state converges to a mixed-strategy equilibrium. Finally, we apply the dynamic to negative definite and doubly symmetric games.
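
A minimal sketch of the objects described above, assuming the standard Cross (1973) learning rule with payoffs normalized to [0,1], a symmetric payoff matrix A, and the population state written as a measure μ_t on the mixed-strategy simplex Δ; the precise drift used in the paper may differ in detail. If an agent with mixed strategy x plays pure strategy i and receives payoff π, the Cross rule updates

\[
x_i' = x_i + \pi\,(1 - x_i), \qquad x_j' = (1-\pi)\,x_j \quad (j \neq i).
\]

Against the mean population strategy \(\bar{x}_t = \int_{\Delta} x \, d\mu_t(x)\), the expected continuous-time motion of an individual strategy is, in the spirit of Börgers and Sarin (1997), the replicator vector field

\[
v_i(x,\bar{x}_t) \;=\; x_i\big[(A\bar{x}_t)_i - x^{\top} A \bar{x}_t\big],
\]

and the distribution of strategies is transported by the continuity equation

\[
\partial_t \mu_t + \nabla \cdot \big(\mu_t \, v(\cdot,\bar{x}_t)\big) = 0 .
\]

When \(\mu_0 = \delta_{x_0}\), the population stays homogeneous, \(\mu_t = \delta_{x(t)}\), and the equation collapses to the ordinary replicator dynamic \(\dot{x}_i = x_i[(Ax)_i - x^{\top}Ax]\), consistent with the special case noted in the abstract.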
