Theoretical and Experimental Analysis of an Evolutionary Social-Learning Game

An important way to learn new actions and behaviors is by observing others, and several evolutionary games have been developed to investigate which learning strategies work best and how they might have evolved. In this paper we present an extensive set of mathematical and simulation results for Cultaptation, one of the best-known such games. We derive a formula for measuring a strategy’s expected reproductive success, provide algorithms to compute near-best-response strategies and near-Nash equilibria, and provide techniques for implementing those algorithms efficiently. Our experimental studies provide strong evidence for the following hypotheses:

1. The best strategies for Cultaptation and similar games are likely to be conditional ones, in which the choice of action at each round is conditioned on the agent’s accumulated experience. Such strategies (or close approximations of them) can be computed by a lookahead search that predicts how each possible choice of action at the current round is likely to affect future performance.

2. Such strategies are likely to exploit most of the time, but will have ways of quickly detecting structural shocks, so that they can switch to innovation in order to learn how to respond to such shocks. This conflicts with the conventional wisdom that successful social-learning strategies are characterized by a high frequency of innovation, and agrees with recent experiments by others on human subjects that also challenge the conventional wisdom.
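To make the second hypothesis concrete, the following toy sketch shows one way a conditional strategy could exploit by default and switch to innovation after a payoff drop. It is not the paper's algorithm or the actual Cultaptation rules. The function name `simulate`, the `shock_ratio` threshold, and the deterministic innovation rule (learn the lowest-numbered unknown action) are all assumptions made for illustration; innovation in the real game is randomized and the game also includes an observe move, which is omitted here.

```python
from typing import Dict, List


def simulate(envs: List[Dict[int, float]], shock_ratio: float = 0.5) -> List[str]:
    """Run a toy exploit-by-default agent over a sequence of environments.

    envs[t] maps each action to its payoff at round t (assumed non-empty).
    shock_ratio is a hypothetical threshold: a payoff below
    shock_ratio * remembered_value is treated as a structural shock.
    """
    known: Dict[int, float] = {}   # agent's remembered payoff per action
    must_innovate = True           # nothing is known at the start
    moves: List[str] = []

    for env in envs:
        unknown = sorted(a for a in env if a not in known)
        if must_innovate and unknown:
            # Toy innovation: learn the current payoff of one unknown action.
            action = unknown[0]
            known[action] = env[action]
            moves.append("innovate")
            must_innovate = False
            continue

        # Exploit the action with the highest remembered payoff.
        best = max(known, key=known.get)
        payoff = env[best]
        # Condition the next move on experience: a sharp drop signals a shock.
        must_innovate = payoff < shock_ratio * known[best]
        known[best] = payoff
        moves.append(f"exploit:{best}")

    return moves


if __name__ == "__main__":
    # Action 0 is best for four rounds, then a shock makes action 1 best.
    rounds = [{0: 10.0, 1: 4.0}] * 4 + [{0: 1.0, 1: 8.0}] * 3
    print(simulate(rounds))
```

In this example the agent innovates only twice in seven rounds: once at the start and once immediately after the exploited payoff collapses. That matches the qualitative pattern the hypothesis describes, though the threshold rule here is a hand-written stand-in for the lookahead search mentioned in the first hypothesis.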
