Event-triggered reinforcement learning control for the quadrotor UAV with actuator saturation

Abstract This paper proposes an event-triggered reinforcement learning (RL) control strategy to stabilize a quadrotor unmanned aerial vehicle (UAV) subject to actuator saturation. Because the quadrotor UAV has complex dynamics that are difficult to model accurately, a model-free reinforcement learning scheme is designed. To respect the practical limits of the actuators, the controller output is constrained by a bounded function. To reduce the computational load on the onboard computer, an event-triggered mechanism is developed that updates the controller only when the triggering condition is satisfied. The proposed controller is implemented with two neural networks, called the critic and the actor. Several RL techniques, e.g. off-policy training and experience replay, are used to speed up training. The stability of the closed-loop system is proved by Lyapunov analysis. Simulation results for a stabilization task and a tracking task verify the theoretical analysis and show that the update frequency of the controller is greatly reduced.
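The interplay between a bounded control output and an event-triggered update can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the plant is a one-axis double integrator rather than a quadrotor, a fixed PD law stands in for the trained actor network, the bound is a tanh function, and the triggering rule is a generic relative-threshold test (recompute when the gap between the current state and the last sampled state exceeds `sigma` times the current state norm). The names `saturate`, `simulate`, `sigma`, `kp`, `kd` and `u_max` are hypothetical.

```python
import math


def saturate(raw, u_max):
    """Smoothly bound a raw command to the interval (-u_max, u_max)."""
    return u_max * math.tanh(raw / u_max)


def simulate(steps=2000, dt=0.01, u_max=2.0, sigma=0.1, kp=4.0, kd=3.0):
    """Event-triggered, saturated PD control of a double integrator.

    Assumed stand-ins: the PD law replaces the actor network and the
    relative-threshold test replaces the paper's triggering condition.
    Returns (final state norm, number of control updates, peak |u|).
    """
    pos, vel = 1.0, 0.0                # initial state
    held_pos, held_vel = pos, vel      # state sampled at the last event
    u = saturate(-(kp * held_pos + kd * held_vel), u_max)
    updates = 1                        # the initial computation counts
    peak_u = abs(u)

    for _ in range(steps):
        # Gap between the current state and the last sampled state.
        gap = math.hypot(pos - held_pos, vel - held_vel)
        if gap > sigma * math.hypot(pos, vel):
            # Event: resample the state and recompute the control.
            held_pos, held_vel = pos, vel
            u = saturate(-(kp * held_pos + kd * held_vel), u_max)
            updates += 1
            peak_u = max(peak_u, abs(u))
        # Between events the control is held constant (zero-order hold).
        pos += dt * vel                # forward-Euler integration
        vel += dt * u

    return math.hypot(pos, vel), updates, peak_u


if __name__ == "__main__":
    final_norm, updates, peak_u = simulate()
    print(f"final state norm: {final_norm:.3e}")
    print(f"control updates:  {updates} of 2000 steps")
    print(f"peak |u|:         {peak_u:.3f}")
```

In this sketch the control is recomputed only at events and held constant in between, so the update count falls well below the number of simulation steps, while the tanh bound keeps every command inside the assumed actuator limit.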
