Formation Control With Collision Avoidance Through Deep Reinforcement Learning Using Model-Guided Demonstration

Generating collision-free, time-efficient paths in an uncertain, dynamic environment poses major challenges for the formation control with collision avoidance (FCCA) problem in a leader-follower structure. In particular, the followers must account for formation maintenance and collision avoidance simultaneously. Unfortunately, most existing works simply combine methods that address the two problems separately. In this article, a new method based on deep reinforcement learning (RL) is proposed to solve the FCCA problem. Specifically, the learning-based policy is extended to the field of formation control through a two-stage training framework: an imitation learning (IL) stage followed by an RL stage. In the IL stage, a model-guided method consisting of a consensus theory-based formation controller and an optimal reciprocal collision avoidance strategy is designed to speed up training and increase efficiency. In the RL stage, a compound reward function is presented to guide the training. In addition, we design a formation-oriented network structure to perceive the environment. Long short-term memory is adopted to enable the network to perceive information about an uncertain number of obstacles, and a transfer training approach is adopted to improve the generalization of the network across different scenarios. Numerous representative simulations are conducted, and our method is further deployed on an experimental platform based on a system of multiple omnidirectional-wheeled cars. The effectiveness and practicability of the proposed method are validated by both simulation and experimental results.
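As a rough illustration of the kind of consensus theory-based formation controller that the IL stage draws on, the sketch below implements a textbook leader-follower consensus law for a single-integrator follower. It is not the authors' controller: the function names, the gain, the speed limit, the desired offset, and the simulation parameters are all assumptions chosen for illustration, and collision avoidance is omitted entirely.

```python
import math


def consensus_follower_velocity(follower_pos, leader_pos, leader_vel,
                                offset, gain=1.0, max_speed=1.0):
    """Illustrative leader-follower consensus law (hypothetical helper).

    The follower should sit at leader_pos + offset. The command feeds
    forward the leader velocity and adds a proportional term that drives
    the formation error to zero. gain and max_speed are assumed values.
    """
    # Formation error: follower position minus its desired slot.
    ex = follower_pos[0] - (leader_pos[0] + offset[0])
    ey = follower_pos[1] - (leader_pos[1] + offset[1])
    # Leader velocity feedforward plus proportional error correction.
    vx = leader_vel[0] - gain * ex
    vy = leader_vel[1] - gain * ey
    # Saturate the command to respect an assumed actuator limit.
    speed = math.hypot(vx, vy)
    if speed > max_speed:
        scale = max_speed / speed
        vx, vy = vx * scale, vy * scale
    return (vx, vy)


def simulate(steps=200, dt=0.1):
    """Euler-integrate one follower and return the final formation error."""
    leader, leader_vel = (0.0, 0.0), (0.3, 0.0)
    follower, offset = (-2.0, 1.5), (-1.0, -1.0)
    for _ in range(steps):
        v = consensus_follower_velocity(follower, leader, leader_vel, offset)
        follower = (follower[0] + v[0] * dt, follower[1] + v[1] * dt)
        leader = (leader[0] + leader_vel[0] * dt,
                  leader[1] + leader_vel[1] * dt)
    return math.hypot(follower[0] - (leader[0] + offset[0]),
                      follower[1] - (leader[1] + offset[1]))


if __name__ == "__main__":
    print("final formation error:", simulate())
```

A controller of this kind handles only formation maintenance. In the framework described above it would be paired with a reciprocal collision avoidance strategy to produce demonstrations, which the RL stage then refines.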
