Learning-Based Predictive Control for Discrete-Time Nonlinear Systems With Stochastic Disturbances

In this paper, a learning-based predictive control (LPC) scheme is proposed for adaptive optimal control of discrete-time nonlinear systems under stochastic disturbances. Unlike conventional model predictive control (MPC), which relies on open-loop optimization or simplified closed-loop optimal control techniques in each horizon, LPC formulates the control task in each horizon as a closed-loop nonlinear optimal control problem and develops a finite-horizon iterative reinforcement learning (RL) algorithm to obtain closed-loop optimal or suboptimal solutions. In this way, RL and adaptive dynamic programming (ADP) serve as a new class of closed-loop learning-based optimization techniques for nonlinear predictive control with stochastic disturbances. Moreover, LPC decomposes the infinite-horizon optimal control problem of previous RL and ADP methods into a series of finite-horizon problems, which reduces computational costs and improves learning efficiency. Convergence of the finite-horizon iterative RL algorithm in each prediction horizon and Lyapunov stability of the closed-loop control system are proved. In addition, by using successive policy updates between adjacent prediction horizons, LPC incurs lower computational costs than conventional MPC, whose optimization procedures in different prediction horizons are independent. Simulation results illustrate that, compared with conventional nonlinear MPC and ADP, the proposed LPC scheme achieves better performance in terms of both policy optimality and computational efficiency.
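
The receding-horizon structure described above can be sketched informally as follows. This is a minimal illustration only, assuming a scalar discrete-time system with additive Gaussian disturbance, a quadratic stage cost, and a simple linear feedback policy refined by a finite-difference gradient step; the names (`lpc_loop`, `fit_horizon_policy`, the dynamics `f`) are hypothetical stand-ins and the update rule is not the paper's finite-horizon iterative RL algorithm, only a placeholder for it. The point is the loop structure: each horizon solves a closed-loop policy optimization warm-started from the previous horizon's policy, then applies the first action.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, u):
    """Illustrative nonlinear dynamics (assumed, not from the paper)."""
    return 0.8 * np.sin(x) + 0.5 * u

def stage_cost(x, u):
    return x**2 + 0.1 * u**2

def rollout_cost(x0, theta, N, noise_std=0.05, n_samples=32):
    """Monte Carlo estimate of the N-step cost of the policy u = theta * x."""
    total = 0.0
    for _ in range(n_samples):
        x = x0
        for _ in range(N):
            u = theta * x
            total += stage_cost(x, u)
            x = f(x, u) + noise_std * rng.standard_normal()
    return total / n_samples

def fit_horizon_policy(x0, theta_init, N, iters=30, step=0.05, eps=1e-3):
    """Finite-horizon closed-loop policy improvement within one prediction
    horizon, warm-started from the previous horizon's policy (successive
    policy updates). Here: plain finite-difference gradient descent."""
    theta = theta_init
    for _ in range(iters):
        grad = (rollout_cost(x0, theta + eps, N)
                - rollout_cost(x0, theta - eps, N)) / (2 * eps)
        theta -= step * grad
    return theta

def lpc_loop(x0, N=10, T=40, noise_std=0.05):
    """Receding-horizon loop: optimize a closed-loop policy over each horizon,
    apply its first action, and reuse the policy as the next warm start."""
    x, theta = x0, 0.0
    trajectory = [x]
    for _ in range(T):
        theta = fit_horizon_policy(x, theta, N)  # warm start from last horizon
        u = theta * x                            # apply first closed-loop action
        x = f(x, u) + noise_std * rng.standard_normal()
        trajectory.append(x)
    return np.array(trajectory)

if __name__ == "__main__":
    traj = lpc_loop(x0=2.0)
    print("final state magnitude:", abs(traj[-1]))
```

The warm start is what distinguishes this loop from a conventional MPC sketch: because adjacent horizons share the previous policy as an initial guess, each horizon's optimization starts near a good solution instead of being solved independently from scratch.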
