Continuous-Time Distributed Policy Iteration for Multicontroller Nonlinear Systems

This article establishes a novel distributed policy iteration algorithm for infinite-horizon optimal control problems of continuous-time nonlinear systems with multiple controllers. In each iteration of the algorithm, only one controller's control law is updated, while the control laws of the other controllers remain unchanged. The main contribution is that the iterative control laws are improved one at a time, rather than all being updated in every iteration as in traditional policy iteration algorithms, which effectively relieves the computational burden of each iteration. The properties of the distributed policy iteration algorithm for continuous-time nonlinear systems are analyzed, including the admissibility of the iterative control laws. Monotonicity, convergence, and optimality are discussed, showing that the iterative value function is nonincreasing and converges to the solution of the Hamilton–Jacobi–Bellman equation. Finally, numerical simulations illustrate the effectiveness of the proposed method.
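
The one-controller-at-a-time update described above can be illustrated on a deliberately simplified case. The sketch below is not the article's algorithm for general nonlinear systems. It assumes a scalar linear system dx/dt = a*x + b[0]*u_0 + b[1]*u_1 + ... with quadratic cost q*x^2 + sum_j r[j]*u_j^2, linear control laws u_j = -k[j]*x, and a round-robin update order. All function and variable names are hypothetical. Under these assumptions the value function is V(x) = p*x^2, policy evaluation reduces to a scalar Lyapunov equation, and the Hamilton–Jacobi–Bellman equation reduces to a scalar Riccati equation.

```python
import math


def evaluate_policy(a, b, r, q, k):
    """Policy evaluation: return p such that V(x) = p * x**2 for u_j = -k[j] * x.

    Solves the scalar Lyapunov equation 2 * a_cl * p + q + sum_j r[j] * k[j]**2 = 0.
    """
    a_cl = a - sum(bj * kj for bj, kj in zip(b, k))  # closed-loop dynamics
    if a_cl >= 0:
        # An admissible control law must stabilize the system (finite cost).
        raise ValueError("control law is not admissible")
    running_cost = q + sum(rj * kj ** 2 for rj, kj in zip(r, k))
    return running_cost / (-2.0 * a_cl)


def distributed_policy_iteration(a, b, r, q, k0, iterations):
    """Improve one controller's gain per iteration; the others stay unchanged."""
    k = list(k0)
    history = [evaluate_policy(a, b, r, q, k)]  # value coefficients p_0, p_1, ...
    for i in range(iterations):
        j = i % len(k)  # assumption: controllers are visited in round-robin order
        p = history[-1]
        # Minimizing the Hamiltonian over u_j alone gives u_j = -(b[j] * p / r[j]) * x.
        k[j] = b[j] * p / r[j]
        history.append(evaluate_policy(a, b, r, q, k))
    return k, history


def riccati_solution(a, b, r, q):
    """Positive root of 2*a*p + q - p**2 * sum_j b[j]**2 / r[j] = 0 (the scalar HJB)."""
    s = sum(bj ** 2 / rj for bj, rj in zip(b, r))
    return (a + math.sqrt(a ** 2 + q * s)) / s


if __name__ == "__main__":
    # Illustrative numbers only: open-loop unstable plant, two controllers,
    # and an admissible initial control law (closed loop 1 - 3 = -2 < 0).
    gains, values = distributed_policy_iteration(
        a=1.0, b=[1.0, 2.0], r=[1.0, 1.0], q=1.0, k0=[3.0, 0.0], iterations=60
    )
    print("final gains:", gains)
    print("final value coefficient:", values[-1])
    print("Riccati solution:", riccati_solution(1.0, [1.0, 2.0], [1.0, 1.0], 1.0))
```

In this toy setting, the sequence of value coefficients returned in `history` can be checked against the two properties named in the abstract: it should be nonincreasing, and it should approach the Riccati root returned by `riccati_solution`. Each iteration computes a single gain rather than all of them, which is the source of the per-iteration saving.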
