Learning Through Reinforcement and Replicator Dynamics

This paper considers a version of Bush and Mosteller's stochastic learning theory in the context of games. We compare this model of learning to a model of biological evolution. The purpose is to investigate analogies between learning and evolution. We find that in the continuous time limit the biological model coincides with the deterministic, continuous time replicator process. We give conditions under which the same is true for the learning model. For the case that these conditions do not hold, we show that the replicator process continues to play an important role in characterising the continuous time limit of the learning model, but that a different effect ("Probability Matching") enters as well.
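The link between reinforcement learning and the replicator process can be illustrated with a small numerical check. The abstract does not state the update rule, so the sketch below assumes a Cross-type linear reinforcement rule (in the spirit of Bush and Mosteller), payoffs normalised to lie in [0, 1], and a fixed payoff per action; the function names, the step size `theta`, and the example numbers are all hypothetical. Under these assumptions the expected one-step change in the choice probabilities equals `theta` times the replicator vector field.

```python
def cross_update(p, action, payoff, theta):
    """One step of an assumed Cross-type rule: move the probability vector
    toward the chosen action by a fraction theta * payoff (payoff in [0, 1])."""
    return [
        q + theta * payoff * ((1.0 if i == action else 0.0) - q)
        for i, q in enumerate(p)
    ]


def expected_change(p, payoffs, theta):
    """Exact expected one-step change in p, averaging over the learner's
    own random choice of action (action a is drawn with probability p[a])."""
    n = len(p)
    delta = [0.0] * n
    for a in range(n):
        new_p = cross_update(p, a, payoffs[a], theta)
        for i in range(n):
            delta[i] += p[a] * (new_p[i] - p[i])
    return delta


def replicator_field(p, payoffs):
    """Replicator vector field: p_i * (u_i - average payoff)."""
    avg = sum(q * u for q, u in zip(p, payoffs))
    return [q * (u - avg) for q, u in zip(p, payoffs)]


# Hypothetical example: three actions with fixed payoffs in [0, 1].
p = [0.2, 0.5, 0.3]
payoffs = [0.9, 0.4, 0.1]
theta = 0.01

drift = expected_change(p, payoffs, theta)
field = replicator_field(p, payoffs)

for d, f in zip(drift, field):
    print(f"expected change {d:+.6f}   theta * replicator {theta * f:+.6f}")
```

The two printed columns agree up to floating-point error. This only shows that the expected motion matches under the stated assumptions; it does not reproduce the paper's limit argument or the case in which probability matching appears.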

[1]  R. R. Bush,et al.  A Mathematical Model for Simple Learning , 1951 .

[2]  Frederick Mosteller,et al.  Stochastic Models for Learning , 1956 .

[3]  S. Siegel DECISION MAKING AND LEARNING UNDER VARYING CONDITIONS OF REINFORCEMENT , 1961 .

[4]  W. Estes Markov learning models for multiperson interactions, by Patrick Suppes and Richard C. Atkinson, Stanford University Press, Stanford, California, 1960, 296 pp., $8.25 , 1961 .

[5]  M. Norman Some convergence theorems for stochastic learning models with distance diminishing operators , 1968 .

[6]  J. Cross A Stochastic Learning Model of Economic Behavior , 1973 .

[7]  Richard Schmalensee,et al.  Alternative models of bandit selection , 1975 .

[8]  A. Kuehn Consumer Brand Choice--A Learning Process? , 1976 .

[9]  P. Taylor,et al.  Evolutionarily Stable Strategies and Game Dynamics , 1978 .

[10]  Moshe Givon,et al.  Application of a Composite Stochastic Model of Brand Choice , 1979 .

[11]  P. Taylor Evolutionarily stable strategies with two types of player , 1979, Journal of Applied Probability.

[12]  S. Lakshmivarahan,et al.  Learning Algorithms for Two-Person Zero-Sum Stochastic Games with Incomplete Information , 1981, Math. Oper. Res..

[13]  K. Narendra,et al.  Learning Algorithms for Two-Person Zero-Sum Stochastic Games with Incomplete Information: A Unified Approach , 1982 .

[14]  J. Cross A theory of adaptive economic behavior , 1983 .

[15]  E. Akin,et al.  Evolutionary dynamics of zero-sum games , 1984, Journal of mathematical biology.

[16]  J. Butcher The Numerical Analysis of Ordinary Differential Equations , 1986 .

[17]  Josef Hofbauer,et al.  The theory of evolution and dynamical systems , 1988 .

[18]  Pierre Priouret,et al.  Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.

[19]  T. S. Robertson,et al.  Handbook of Consumer Behavior , 1990 .

[20]  L. Samuelson,et al.  EVOLUTIONARY STABILITY IN SYMMETRIC GAMES , 1990 .

[21]  Franz J. Weissing,et al.  Evolutionary stability and dynamic stability in a class of evolutionary normal form games , 1991 .

[22]  Suzanne Scotchmer,et al.  On the evolution of optimizing behavior , 1991 .

[23]  Richard T. Boylan Laws of large numbers for dynamical systems with randomly matched individuals , 1992 .

[24]  L. Samuelson,et al.  Evolutionary Stability in Asymmetric Games , 1992 .

[25]  J. Sobel,et al.  On the limit points of discrete selection dynamics , 1992 .

[26]  I. Gilboa,et al.  A Model of Random Matching , 1992 .

[27]  V. V. Phansalkar,et al.  Absolutely expedient algorithms for learning Nash equilibria , 1994 .

[28]  Ken Binmore,et al.  Muddling Through: Noisy Equilibrium Selection , 1997 .

[29]  V. V. Phansalkar,et al.  Decentralized Learning of Nash Equilibria in Multi-Person Stochastic Games With Incomplete Information , 1994, IEEE Trans. Syst. Man Cybern. Syst..

[30]  W. Estes Toward a Statistical Theory of Learning. , 1994 .

[31]  Dilip Mookherjee,et al.  Learning behavior in an experimental matching pennies game , 1994 .

[32]  Richard T. Boylan Continuous Approximation of Dynamical Systems with Randomly Matched Individuals , 1995 .

[33]  A. Roth,et al.  Learning in Extensive-Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term , 1995 .

[34]  J. Weibull,et al.  Evolutionary Selection in Normal-Form Games , 1995 .

[35]  K. Schlag Why Imitate, and If So, How?: A Boundedly Rational Approach to Multi-armed Bandits , 1998 .