Evolving intrinsic motivations for altruistic behavior

Multi-agent cooperation is an important feature of the natural world. Many tasks involve individual incentives that are misaligned with the common good, yet a wide range of organisms, from bacteria to insects and humans, are able to overcome their differences and collaborate. The emergence of cooperative behavior amongst self-interested individuals is therefore an important question for the fields of multi-agent reinforcement learning (MARL) and evolutionary theory. Here, we study a particular class of multi-agent problems called intertemporal social dilemmas (ISDs), where the conflict between the individual and the group is particularly sharp. By combining MARL with appropriately structured natural selection, we demonstrate that individual inductive biases for cooperation can be learned in a model-free way. To achieve this, we introduce an innovative modular architecture for deep reinforcement learning agents which supports multi-level selection. We present results in two challenging environments, and interpret these in the context of cultural and ecological evolution.
