Meta-control of social learning strategies

Social learning, copying others' behavior without direct experience, offers a cost-effective means of knowledge acquisition. However, it raises the fundamental question of whose information is reliable: that of successful individuals or that of the majority. Copying the former is known as the success-based social learning strategy, and copying the latter as the conformist strategy. We show here that while the success-based strategy fully exploits benign environments of low uncertainty, it fails in uncertain environments. The conformist strategy, by contrast, can effectively mitigate this adverse effect. Based on these findings, we hypothesized that meta-control of individual and social learning strategies provides effective and sample-efficient learning in volatile and uncertain environments. Simulations on a set of environments with various levels of volatility and uncertainty confirmed our hypothesis. The results imply that meta-control of social learning gives agents the leverage to resolve environmental uncertainty with minimal exploration cost, by exploiting others' learning as an external knowledge base.
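To make the two strategies concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes each demonstrator exposes one chosen action and one noisy payoff sample. The function names, the scalar `uncertainty` estimate, and the `threshold` switching rule are all hypothetical and chosen only for illustration; the abstract does not specify how arbitration is performed.

```python
from collections import Counter


def success_based_choice(actions, payoffs):
    """Copy the action of the demonstrator with the highest observed payoff."""
    best = max(range(len(actions)), key=lambda i: payoffs[i])
    return actions[best]


def conformist_choice(actions):
    """Copy the action chosen by the largest number of demonstrators."""
    return Counter(actions).most_common(1)[0][0]


def meta_control_choice(actions, payoffs, uncertainty, threshold=0.5):
    """Hypothetical arbitration rule (an assumption, not the paper's method):
    trust single payoff samples when uncertainty is low, otherwise follow
    the majority."""
    if uncertainty < threshold:
        return success_based_choice(actions, payoffs)
    return conformist_choice(actions)


# Four demonstrators: three chose action 0, one chose action 1.
actions = [0, 0, 0, 1]
# One payoff sample per demonstrator; under high uncertainty a single
# large sample may be a lucky draw rather than evidence of a better action.
payoffs = [0.4, 0.5, 0.6, 0.9]

low = meta_control_choice(actions, payoffs, uncertainty=0.1)
high = meta_control_choice(actions, payoffs, uncertainty=0.9)
print(low, high)
```

In this toy case the success-based rule follows the single highest payoff, while the conformist rule follows the majority, which is why the two can disagree when payoff samples are noisy.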
