Learning to signal: Analysis of a micro-level reinforcement model

We consider the following signaling game. Nature moves first, choosing a state from the set {1,2}. Player 1 (the Sender) observes the state and plays a signal from the set {A,B}. Player 2 (the Receiver) sees only Player 1's signal and plays an act from the set {1,2}. Both players win if Player 2's act equals Nature's state and lose otherwise. Players are told whether they have won or lost, and the game is repeated. An urn scheme for learning coordination in this game is as follows. Each node of the decision tree for Players 1 and 2 contains an urn with balls of two colors, one color for each of the two possible decisions at that node. Players make decisions by drawing from the appropriate urns. After a win, each ball that was drawn is reinforced by adding another ball of the same color to its urn. This game has a number of equilibria other than the optimal ones. However, we show that the urn scheme achieves asymptotically optimal coordination.
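The urn scheme described above can be sketched as a short simulation. This is a minimal illustration, not the authors' code. The abstract does not specify how Nature chooses, how many balls each urn starts with, or whether a drawn ball is returned, so the sketch assumes that Nature picks each state with probability 1/2, that every urn starts with one ball of each color, and that drawn balls are returned to their urns. The names `draw` and `simulate` are hypothetical.

```python
import random


def draw(urn, rng):
    """Draw a color with probability proportional to its ball count."""
    colors = list(urn)
    weights = [urn[c] for c in colors]
    return rng.choices(colors, weights=weights, k=1)[0]


def simulate(rounds, seed=0):
    """Play the repeated signaling game with urn reinforcement.

    Assumptions (not stated in the abstract): Nature is uniform on {1, 2},
    each urn starts with one ball per color, and drawn balls are returned.
    """
    rng = random.Random(seed)
    # One Sender urn per state of Nature; colors are the signals A and B.
    sender = {1: {"A": 1, "B": 1}, 2: {"A": 1, "B": 1}}
    # One Receiver urn per signal; colors are the acts 1 and 2.
    receiver = {"A": {1: 1, 2: 1}, "B": {1: 1, 2: 1}}
    wins = 0
    for _ in range(rounds):
        state = rng.choice([1, 2])
        signal = draw(sender[state], rng)
        act = draw(receiver[signal], rng)
        if act == state:
            # After a win, reinforce both balls that were drawn.
            sender[state][signal] += 1
            receiver[signal][act] += 1
            wins += 1
    return sender, receiver, wins


if __name__ == "__main__":
    sender, receiver, wins = simulate(10_000)
    print("win fraction:", wins / 10_000)
    print("sender urns:", sender)
    print("receiver urns:", receiver)
```

Under these assumptions, every win adds exactly one ball to a Sender urn and one to a Receiver urn, so the ball counts record the full history of successes. A single run only illustrates the dynamics; it does not establish the asymptotic result claimed in the abstract.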
