Self-Organised Swarm Flocking with Deep Reinforcement Learning

Optimising the parameters of a swarm flocking model is tedious because it usually requires hand-tuning. In this paper, we developed a self-organised flocking mechanism for a swarm of homogeneous robots. The proposed mechanism used deep reinforcement learning to teach the swarm to flock in a continuous state and action space. Collective motion was represented by a self-organising dynamic model based on linear spring-like forces between self-propelled particles in an active crystal. We tuned the inverse rotational and translational damping coefficients of the dynamic model for swarm populations of $N \in \{25, 100\}$ robots. We studied reinforcement learning in a centralised multi-agent setting, in which a global state-space matrix is accessible to both the actor and the critic networks. Furthermore, we showed that our method could train the system to flock regardless of the sparsity of the swarm population.
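The abstract does not state the equations of the dynamic model, so the following is only a minimal sketch of one common form of such a model, under stated assumptions: each particle moves along its heading, pushed by the component of the net spring force parallel to the heading (scaled by an inverse translational damping coefficient, here called `alpha`) and turned by the perpendicular component (scaled by an inverse rotational damping coefficient, here called `beta`). The names `alpha`, `beta`, `v0`, `rest_len`, `k`, and all function names are illustrative assumptions, not taken from the paper. In a learning setup like the one described, `alpha` and `beta` would plausibly be the continuous quantities being tuned.

```python
import numpy as np


def spring_forces(pos, pairs, rest_len=1.0, k=1.0):
    """Net linear spring force on each particle (pos has shape (N, 2)).

    `pairs` lists each connected pair (i, j) once; this fixed neighbour
    list is an assumption of the sketch.
    """
    forces = np.zeros_like(pos, dtype=float)
    for i, j in pairs:
        d = pos[j] - pos[i]
        dist = np.linalg.norm(d)
        f = k * (dist - rest_len) * d / dist  # pulls i toward j when stretched
        forces[i] += f
        forces[j] -= f
    return forces


def step(pos, theta, pairs, alpha, beta, v0=0.0, rest_len=1.0, k=1.0, dt=0.01):
    """One explicit Euler step of the assumed overdamped dynamics."""
    heading = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    normal = np.stack([-np.sin(theta), np.cos(theta)], axis=1)
    f = spring_forces(pos, pairs, rest_len, k)
    f_par = np.sum(f * heading, axis=1)   # force component along the heading
    f_perp = np.sum(f * normal, axis=1)   # force component across the heading
    new_pos = pos + dt * (v0 + alpha * f_par)[:, None] * heading
    new_theta = theta + dt * beta * f_perp
    return new_pos, new_theta


def polar_order(theta):
    """Magnitude of the mean heading vector: 1 when aligned, near 0 when not."""
    return float(np.hypot(np.cos(theta).mean(), np.sin(theta).mean()))
```

A quantity such as `polar_order` is one candidate for measuring how well the swarm flocks; whether the paper rewards it is not stated in the abstract.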
