Network formation by reinforcement learning: The long and medium run

We investigate a simple stochastic model of social network formation driven by reinforcement learning with discounting of the past. In the limit, for any value of the discounting parameter, small, stable cliques form. However, the time it takes to reach this limiting state is highly sensitive to the discounting parameter; depending on its value, the limiting result may or may not be a good predictor of the network observed at realistic time scales.
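A minimal simulation sketch may help make the model concrete. The Python code below assumes a Skyrms-Pemantle-style dynamic in which each agent holds a vector of interaction weights, visits a partner with probability proportional to those weights, discounts all weights by a factor (1 - x) each period, and reinforces the partner actually visited. The parameter values (N, DISCOUNT, STEPS) and the unit reinforcement are illustrative assumptions, not the paper's exact specification.

import random

# Sketch of a discounted-reinforcement network-formation model in the
# spirit of Skyrms & Pemantle (2000). Parameters below are assumptions
# for illustration only.

N = 10          # number of agents (assumed)
DISCOUNT = 0.1  # discounting parameter x: past weights decay by (1 - x)
STEPS = 5000    # observation horizon (assumed)

# w[i][j]: agent i's propensity to visit agent j; start uniform, no self-visits.
w = [[0.0 if i == j else 1.0 for j in range(N)] for i in range(N)]

def choose_partner(i):
    """Agent i picks a partner with probability proportional to its weights."""
    return random.choices(range(N), weights=w[i])[0]

for t in range(STEPS):
    visits = [choose_partner(i) for i in range(N)]
    for i in range(N):
        for j in range(N):
            w[i][j] *= 1.0 - DISCOUNT   # discount the past
        w[i][visits[i]] += 1.0          # reinforce today's interaction

# Report each agent's most-visited partner.
for i in range(N):
    best = max(range(N), key=lambda j: w[i][j])
    print(f"agent {i} -> agent {best}")

Running this sketch typically shows each agent's weight concentrating on one or a few partners, the clique formation the abstract describes; how many steps that concentration takes varies sharply with DISCOUNT, which is the medium-run sensitivity the abstract highlights.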
