Based on Q-Learning Optimal Tracking Control Schemes for Linear Itô Stochastic Systems With Markovian Jumps

In this brief, a Q-learning algorithm is designed to solve the optimal tracking control problem (OTCP) for stochastic systems with Markovian jumps. The problem is formulated via a stochastic augmented system composed of the dynamic system and the reference trajectory system, and ideas from Q-learning are then used to provide a solution. Specifically, a critic neural network (NN) is used to estimate the optimal cost function, while an actor NN is used to approximate the optimal controller. Moreover, the designed Q-learning algorithm requires knowledge of neither the transition probabilities nor the system matrices. Finally, the designed algorithm is applied to a single-area power system to track continuous sinusoidal waveforms, and simulation results are provided to show its effectiveness and applicability.
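
To illustrate the flow described above, the following minimal sketch (not the brief's exact algorithm) shows model-free Q-learning tracking control on an augmented state X = [x; r], where x is the plant state and r the state of the reference-trajectory generator. The sketch assumes a simplified discrete-time, single-mode (no Markovian jump) linear setting; the matrices A, B, Q_cost, R_cost and the discount factor gamma are hypothetical placeholders, and a quadratic basis stands in for the critic/actor NNs.

```python
import numpy as np

# Minimal sketch (assumption: simplified discrete-time, single-mode setting,
# not the brief's continuous-time Ito/Markovian-jump formulation).
# Augmented state X = [x; r]; dynamics are unknown to the learner and only
# used here to generate data. All matrices are hypothetical placeholders.

np.random.seed(0)

A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.1],
              [0.0, 0.0, 1.0]])      # last state mimics a reference generator
B = np.array([[0.0], [0.5], [0.0]])
Q_cost = np.diag([1.0, 1.0, 0.5])    # penalizes tracking error via augmented state
R_cost = np.array([[0.1]])
gamma = 0.95                          # discount factor

n, m = A.shape[0], B.shape[1]
K = np.zeros((m, n))                  # initial feedback gain (admissible here)

def phi(X, u):
    """Quadratic basis for the Q-function: upper triangle of z z^T, z = [X; u]."""
    z = np.concatenate([X, u])
    return np.outer(z, z)[np.triu_indices(len(z))]

for it in range(10):                  # policy-iteration loop
    # 1) Collect data under the current policy plus exploration noise.
    X = np.random.randn(n)
    Phi, targets = [], []
    for k in range(200):
        u = -K @ X + 0.1 * np.random.randn(m)            # exploratory control
        cost = X @ Q_cost @ X + u @ R_cost @ u
        X_next = A @ X + B @ u + 0.01 * np.random.randn(n)
        u_next = -K @ X_next                              # on-policy successor action
        # Bellman (TD) relation: Q(X,u) - gamma*Q(X',u') = cost,
        # which is linear in the quadratic Q-function weights.
        Phi.append(phi(X, u) - gamma * phi(X_next, u_next))
        targets.append(cost)
        X = X_next
    # 2) Critic step: least-squares fit of the Q-function weights.
    w, *_ = np.linalg.lstsq(np.array(Phi), np.array(targets), rcond=None)
    # Rebuild the symmetric kernel H from the upper-triangular weights.
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = w
    H = (H + H.T) / 2
    # 3) Actor step: greedy improvement, u = -Huu^{-1} Hux X.
    Hux, Huu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(Huu, Hux)

print("learned feedback gain K:", K)
```

Because the critic regression uses only measured states, inputs, and stage costs, neither A nor B (nor, in the jump case, the transition probabilities) enters the update, which is the sense in which the scheme is model-free.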
