Apprenticeship Bootstrapping Via Deep Learning with a Safety Net for UAV-UGV Interaction

In apprenticeship learning (AL), agents learn by observing or acquiring human demonstrations of a task of interest. However, human demonstrations may be unavailable for novel tasks where there is not yet a human expert, or may be too expensive or time-consuming to acquire. This motivated a new algorithm: apprenticeship bootstrapping (ABS). The basic idea is to learn from demonstrations on sub-tasks, then autonomously bootstrap a model on the main, more complex task. The original ABS used inverse reinforcement learning (ABS-IRL); however, that approach is not suitable for continuous action spaces. In this paper, we propose ABS via deep learning (ABS-DL). It is first validated in a simulation environment on an aerial and ground coordination scenario, in which an Unmanned Aerial Vehicle (UAV) is required to keep three Unmanned Ground Vehicles (UGVs) within the field of view (FoV) of the UAV's camera. We then take the extra step of testing the algorithm in a physical environment. Moving a machine learning algorithm from a simulation environment to an actual physical platform is challenging because "mistakes" made by the algorithm while learning could damage the platform. We therefore propose a safety net as a protection layer to ensure that the autonomy of the algorithm during learning does not compromise the safety of the platform. In the tests of ABS-DL in the real environment, the safety net maintains damage-free, collision-avoiding behaviour of the autonomous bodies. The results show that the performance of the proposed approach is comparable to that of a human, and competitive with the traditional approach that uses expert demonstrations performed on the composite task. The safety net demonstrates its advantage by enabling the UAV to operate more safely under the control of the ABS-DL algorithm.
