Neural Flocking: MPC-Based Supervised Learning of Flocking Controllers

We show how a symmetric and fully distributed flocking controller can be synthesized using Deep Learning from a centralized flocking controller. Our approach is based on Supervised Learning, with the centralized controller providing the training data in the form of trajectories of state-action pairs. We use Model Predictive Control (MPC) for the centralized controller, an approach that we have successfully demonstrated on flocking problems. MPC-based flocking controllers are high-performing but computationally expensive. By learning a symmetric and distributed neural flocking controller from a centralized MPC-based one, we achieve the best of both worlds: the neural controllers have high performance (on par with the MPC controllers) and high efficiency. Our experimental results demonstrate the sophisticated nature of the distributed controllers we learn. In particular, the neural controllers are capable of achieving a wide range of flocking-oriented control objectives, including flocking formation, collision avoidance, obstacle avoidance, predator avoidance, and target seeking. Moreover, they generalize the behavior seen in the training data to achieve these objectives in a significantly broader range of scenarios. To verify our neural flocking controller, we use a form of statistical model checking to compute confidence intervals for its convergence rate and time to convergence.
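The abstract does not say which form of statistical model checking is used. As a minimal sketch of the general idea, assuming a Chernoff-Hoeffding bound (one common choice in approximate probabilistic model checking, not necessarily the authors' method), the convergence rate can be estimated from independent simulation runs. The names `hoeffding_sample_size`, `estimate_convergence_rate`, and `toy_run` are hypothetical, and `toy_run` is a placeholder that stands in for simulating the flock under the learned controller.

```python
import math
import random


def hoeffding_sample_size(eps, delta):
    """Runs needed so that |p_hat - p| <= eps holds with probability >= 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def estimate_convergence_rate(run_once, eps=0.05, delta=0.01, seed=0):
    """Estimate the probability that a run converges, with a Hoeffding interval.

    run_once(rng) must return True if the simulated run converged.
    Returns (p_hat, (lower, upper), number_of_runs).
    """
    rng = random.Random(seed)
    n = hoeffding_sample_size(eps, delta)
    successes = sum(1 for _ in range(n) if run_once(rng))
    p_hat = successes / n
    # Clip the interval to [0, 1], since it bounds a probability.
    interval = (max(0.0, p_hat - eps), min(1.0, p_hat + eps))
    return p_hat, interval, n


def toy_run(rng):
    # Hypothetical stand-in: a real check would simulate the flock from a
    # random initial configuration and test a convergence criterion.
    return rng.random() < 0.9


if __name__ == "__main__":
    p_hat, (low, high), n = estimate_convergence_rate(toy_run)
    print(f"runs={n} rate={p_hat:.3f} interval=[{low:.3f}, {high:.3f}]")
```

The interval half-width `eps` and the error probability `delta` together fix the number of runs in advance, so a tighter interval or higher confidence costs more simulations. A confidence interval for time to convergence would need a different estimator, for example one over the mean convergence time of the runs that converged.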
