Emergent Resource Exchange and Tolerated Theft Behavior Using Multiagent Reinforcement Learning

For decades, the evolution of cooperation has piqued the interest of numerous academic disciplines, including game theory, economics, biology, and computer science. In this work, we demonstrate the emergence of a novel and effective resource exchange protocol in which agents drop and pick up resources in a foraging environment. This form of cooperation is made possible by the introduction of a campfire, which adds an extended period of congregation and downtime during which agents can explore otherwise unlikely interactions. We find that the agents learn to avoid being cheated by their exchange partners, but not always by a third party. We also observe the emergence of behavior analogous to tolerated theft, despite the lack of any punishment, combat, or larceny mechanism in the environment.
