Chaos in learning a simple two-person game

We investigate the problem of learning to play the game of rock–paper–scissors. Each player attempts to improve her or his average score by adjusting the frequencies of the three possible responses, using reinforcement learning. For the zero-sum game, the learning process displays Hamiltonian chaos, so the learning trajectory can be simple or complex, depending on initial conditions. We also investigate the non-zero-sum case and show that it can give rise to chaotic transients. This is, to our knowledge, the first demonstration of Hamiltonian chaos in learning a basic two-person game, extending earlier findings of chaotic attractors in dissipative systems. As we argue here, chaos provides an important self-consistency condition for determining when players will learn to behave as though they were fully rational. That chaos can occur in learning a simple game indicates that one should use caution in assuming real people will learn to play a game according to a Nash equilibrium strategy.
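The abstract does not state the learning equations, so the following is only a minimal sketch under stated assumptions: that the continuous-time limit of the reinforcement learning rule is a two-population replicator equation, that a win pays +1 and a loss pays -1, and that ties pay `eps_x` and `eps_y` to the two players (hypothetical parameter names). Under this encoding the game is zero-sum when `eps_x == -eps_y`. The integrator is a plain fixed-step Runge-Kutta scheme chosen for brevity, not a symplectic method.

```python
def payoff_matrix(eps):
    # Rows: own move (rock, paper, scissors); columns: opponent's move.
    # Win = +1, loss = -1, tie = eps (the tie payoff is an assumed parameter).
    return [[eps, -1.0, 1.0],
            [1.0, eps, -1.0],
            [-1.0, 1.0, eps]]


def matvec(m, v):
    # Product of a 3x3 matrix with a length-3 vector.
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def replicator_rhs(state, a, b):
    # state = [x_rock, x_paper, x_scissors, y_rock, y_paper, y_scissors]
    x, y = state[:3], state[3:]
    fx = matvec(a, y)  # expected payoff of each of player X's moves
    fy = matvec(b, x)  # expected payoff of each of player Y's moves
    avg_x = sum(xi * fi for xi, fi in zip(x, fx))
    avg_y = sum(yi * fi for yi, fi in zip(y, fy))
    # A move's frequency grows when it earns more than the player's average.
    dx = [xi * (fi - avg_x) for xi, fi in zip(x, fx)]
    dy = [yi * (fi - avg_y) for yi, fi in zip(y, fy)]
    return dx + dy


def rk4_step(state, dt, a, b):
    # One classical fourth-order Runge-Kutta step.
    def shifted(base, slope, h):
        return [s + h * k for s, k in zip(base, slope)]

    k1 = replicator_rhs(state, a, b)
    k2 = replicator_rhs(shifted(state, k1, dt / 2), a, b)
    k3 = replicator_rhs(shifted(state, k2, dt / 2), a, b)
    k4 = replicator_rhs(shifted(state, k3, dt), a, b)
    return [s + dt / 6 * (p + 2 * q + 2 * r + w)
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]


def simulate(x0, y0, eps_x, eps_y, dt=0.01, steps=10000):
    # Returns the list of states visited, starting with the initial state.
    a, b = payoff_matrix(eps_x), payoff_matrix(eps_y)
    state = list(x0) + list(y0)
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, a, b)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Zero-sum example under the assumed encoding: eps_x = -eps_y.
    traj = simulate([0.5, 0.3, 0.2], [0.2, 0.3, 0.5], 0.5, -0.5, steps=2000)
    print(traj[-1])
```

Running `simulate` from several nearby initial conditions and comparing the resulting trajectories is one way to explore the dependence on initial conditions that the abstract describes. For long runs of the zero-sum case, a structure-preserving integrator would be a more careful choice than this fixed-step scheme.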
