System Stability of Learning-Based Linear Optimal Control With General Discounted Value Iteration

In discounted optimal regulation design, the stability of the controlled system depends on the discount factor. If an inappropriate discount factor is used, the optimal control policy may fail to stabilize the system. This article therefore examines how the discount factor affects the stabilizing property of the control policies. For the linear quadratic regulator problem under the general discounted value iteration algorithm, we develop a stability criterion for the controlled system and rules for selecting the discount factor. Based on the monotonicity of the value function sequence, we establish a method for judging the stability of the controlled system during the iteration process. Moreover, once certain stability conditions are satisfied at some iteration step, all control policies generated after that step are stabilizing. By comparison with the undiscounted optimal control problem, we also construct a practical rule for selecting an appropriate discount factor. Finally, several simulation examples with physical backgrounds demonstrate the theoretical results.
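As a rough illustration of why the discount factor matters, the following minimal sketch runs discounted value iteration for a discrete-time linear quadratic regulator and reports the spectral radius of the closed-loop matrix at each step. It assumes the standard formulation x_{k+1} = A x_k + B u_k with cost sum_k gamma^k (x_k' Q x_k + u_k' R u_k), which the abstract does not spell out. The function names, the zero default for the initial matrix, and the direct eigenvalue check are illustrative assumptions; the eigenvalue check is a stand-in and not the article's monotonicity-based stability criterion.

```python
import numpy as np


def greedy_gain(A, B, R, P, gamma):
    """Gain K of the policy u = -K x that is greedy w.r.t. V(x) = x' P x."""
    S = R + gamma * B.T @ P @ B
    return gamma * np.linalg.solve(S, B.T @ P @ A)


def discounted_value_iteration(A, B, Q, R, gamma, P0=None, num_iters=200):
    """Discounted value iteration for LQR (illustrative sketch).

    P0 is the initial value matrix; "general" value iteration allows a
    positive semidefinite P0 other than zero (zero is assumed by default).
    Returns the final P, the final gain K, and the closed-loop spectral
    radius recorded at every iteration step.
    """
    n = A.shape[0]
    P = np.zeros((n, n)) if P0 is None else np.array(P0, dtype=float)
    radii = []
    for _ in range(num_iters):
        K = greedy_gain(A, B, R, P, gamma)
        # Direct stability check: the policy stabilizes the (undiscounted)
        # system iff every eigenvalue of A - B K lies inside the unit circle.
        radii.append(float(np.max(np.abs(np.linalg.eigvals(A - B @ K)))))
        # Discounted Riccati update written in terms of the current gain.
        P = Q + gamma * A.T @ P @ A - gamma * A.T @ P @ B @ K
    K = greedy_gain(A, B, R, P, gamma)
    return P, K, radii


def closed_loop_radius(A, B, K):
    """Spectral radius of the closed-loop matrix A - B K."""
    return float(np.max(np.abs(np.linalg.eigvals(A - B @ K))))


if __name__ == "__main__":
    # Hypothetical scalar example with an unstable open-loop pole at 2.
    A = np.array([[2.0]])
    B = np.array([[1.0]])
    Q = np.array([[1.0]])
    R = np.array([[1.0]])
    for gamma in (1.0, 0.1):
        _, K, _ = discounted_value_iteration(A, B, Q, R, gamma)
        print(gamma, closed_loop_radius(A, B, K))
```

For this hypothetical scalar plant, the converged policy with gamma = 1 gives a closed-loop pole of about 0.38, whereas gamma = 0.1 leaves it at about 1.73, so the discounted optimal policy is not stabilizing. This is the kind of behavior the selection rules for the discount factor are meant to rule out.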
