Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints

Abstract In this work, output-feedback control of a class of discrete-time non-affine nonlinear systems with unknown control directions and input constraints is addressed using a reinforcement learning (RL) method. Two neural networks (NNs) implement the control: 1) a critic NN that estimates a non-quadratic strategic utility function (SUF) and 2) an action NN that generates the optimized control input and minimizes the SUF. Because the control appears in non-affine form, the implicit function theorem is applied to obtain the optimal control law. For the first time in RL-based control, the discrete Nussbaum gain is introduced to overcome the difficulty that the control directions are unknown, and a non-quadratic SUF is used to deal with the control constraints. The uniform ultimate boundedness of the NN weights and of the closed-loop output tracking error is derived theoretically. Two numerical examples are provided to validate the proposed method.
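The abstract does not give the form of the non-quadratic SUF. As an illustration only, the sketch below evaluates one non-quadratic input penalty that is commonly used in the constrained-input literature, W(u) = 2 r ∫_0^u ū atanh(v/ū) dv for a scalar input bounded by |u| < ū. The function name `nonquadratic_input_cost` and the parameters `u_max` and `r` are hypothetical and are not taken from the paper; the paper's SUF may differ.

```python
import math


def nonquadratic_input_cost(u: float, u_max: float, r: float = 1.0) -> float:
    """Closed form of W(u) = 2*r * integral_0^u u_max*atanh(v/u_max) dv.

    Assumed illustrative penalty, not the paper's SUF. It behaves like
    r*u**2 for small inputs and grows without a quadratic bound as |u|
    approaches the limit u_max, which discourages saturated inputs.
    """
    if u_max <= 0.0:
        raise ValueError("u_max must be positive")
    if abs(u) >= u_max:
        raise ValueError("input must satisfy |u| < u_max")
    s = u / u_max  # normalized input in (-1, 1)
    # integral_0^s atanh(t) dt = s*atanh(s) + 0.5*ln(1 - s^2)
    integral = s * math.atanh(s) + 0.5 * math.log1p(-s * s)
    return 2.0 * r * u_max * u_max * integral


if __name__ == "__main__":
    for u in (0.0, 0.1, 0.5, 0.9, 0.99):
        quadratic = u * u
        print(f"u={u:5.2f}  nonquadratic={nonquadratic_input_cost(u, 1.0):.6f}  "
              f"quadratic={quadratic:.6f}")
```

Running the script prints the penalty next to the plain quadratic cost r·u², which makes the steeper growth near the bound visible. A penalty of this shape is usually paired with a bounded actor output such as ū·tanh(·), so that the generated input respects the constraint by construction.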
