Supplementary Reinforcement Learning Controller Designed for Quadrotor UAVs

Controlling quadrotor UAVs is challenging because of their complex nonlinear dynamics and time-varying disturbances. In this paper, a supplementary controller based on reinforcement learning (RL) is proposed to improve the control performance of quadrotor UAVs. The proposed RL method is built on an actor-critic structure combined with techniques such as Q-learning, temporal-difference learning, and experience replay, which greatly improve the speed and stability of training. On the one hand, the supplementary controller works online alongside the traditional controller, which guarantees the stability of the system. On the other hand, model uncertainties and external disturbances can be suppressed through online RL training. Lyapunov theory is used to prove the convergence of the RL controller's weights. Finally, three simulations are provided to illustrate the effectiveness of the proposed controller.
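The structure described above can be illustrated with a minimal sketch. It is not the paper's implementation: the PD baseline, the clipping of the RL term, the linear critic, and every name and gain below are assumptions chosen for illustration. It shows only how a learned correction could be added to a baseline control signal, how transitions could be stored for experience replay, and what a single temporal-difference update looks like.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions for experience replay."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped first
        self.rng = random.Random(seed)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Sample without replacement, never more than what is stored.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def pd_control(error, error_rate, kp=4.0, kd=2.0):
    """Hypothetical baseline ("traditional") controller: a plain PD law."""
    return kp * error + kd * error_rate


def combined_control(u_base, u_rl, rl_limit=0.5):
    """Add the supplementary RL term to the baseline command.

    Clipping the RL term is an assumption of this sketch, not a detail
    taken from the paper.
    """
    u_rl_clipped = max(-rl_limit, min(rl_limit, u_rl))
    return u_base + u_rl_clipped


def td_update(weights, features, reward, next_features, gamma=0.95, lr=0.1):
    """One temporal-difference step for a linear critic V(s) = w . phi(s)."""
    value = sum(w * f for w, f in zip(weights, features))
    next_value = sum(w * f for w, f in zip(weights, next_features))
    td_error = reward + gamma * next_value - value
    new_weights = [w + lr * td_error * f for w, f in zip(weights, features)]
    return new_weights, td_error
```

In an online loop under these assumptions, each control step would compute `pd_control`, add the actor's output through `combined_control`, push the resulting transition into the `ReplayBuffer`, and apply `td_update` to sampled transitions.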
