Evolution of a Complex Predator-Prey Ecosystem on Large-scale Multi-Agent Deep Reinforcement Learning

Simulation of population dynamics is a central research theme in computational biology that contributes to understanding the interactions between predators and prey. Conventional mathematical tools for this problem, however, cannot account for several important attributes of such systems, such as the intelligent and adaptive behavior exhibited by individual agents. This unrealistic setting is often insufficient to simulate properties of population dynamics found in the real world. In this work, we leverage multi-agent deep reinforcement learning to propose a new model of large-scale predator-prey ecosystems. Using different variants of our proposed environment, we show that multi-agent simulations can exhibit key real-world dynamical properties. To obtain this behavior, we first define a mating mechanism by which existing agents produce new individuals, subject to the conditions of the environment. Furthermore, we incorporate a real-time evolutionary algorithm and show that reinforcement learning enhances the evolution of the agents' physical properties, such as speed, attack, and resilience against attacks.
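
As a rough illustration of how a mating mechanism and trait evolution could fit together, the following minimal sketch lets two agents produce an offspring whose speed, attack, and resilience are inherited with mutation. The abstract does not specify the actual rules, so everything here is an assumption made for illustration: the `Agent` fields, the `mate` function, the energy threshold standing in for "conditions of the environment", the per-trait random inheritance, and the Gaussian mutation. It should not be read as the authors' method.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical physical properties named after those in the abstract.
    speed: float
    attack: float
    resilience: float
    # Hypothetical resource used here as the environmental mating condition.
    energy: float


def mate(a, b, rng, min_energy=1.0, mutation_std=0.05):
    """Return an offspring of a and b, or None if either parent lacks energy.

    Assumed rule: each trait is copied from a randomly chosen parent and
    perturbed by Gaussian noise; each parent gives half its energy to the child.
    """
    if a.energy < min_energy or b.energy < min_energy:
        return None

    def inherit(x, y):
        base = x if rng.random() < 0.5 else y
        # Clamp at zero so traits stay physically meaningful.
        return max(0.0, base + rng.gauss(0.0, mutation_std))

    child = Agent(
        speed=inherit(a.speed, b.speed),
        attack=inherit(a.attack, b.attack),
        resilience=inherit(a.resilience, b.resilience),
        energy=a.energy / 2 + b.energy / 2,
    )
    # Parents pay for reproduction, so total energy is conserved.
    a.energy /= 2
    b.energy /= 2
    return child
```

In a simulation loop, `mate` would be called for eligible pairs at each step, for example `mate(a, b, random.Random(0))`. Selection then acts implicitly: agents whose traits and learned policies keep them alive longer get more chances to reproduce.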
