Approximate Dynamic Programming for Nonlinear-Constrained Optimizations

In this paper, we study the constrained optimization problem for a class of uncertain nonlinear interconnected systems. First, we prove that the solution of the constrained optimization problem can be obtained by solving a set of optimal control problems for constrained auxiliary subsystems. Then, within the framework of approximate dynamic programming, we present a simultaneous policy iteration (SPI) algorithm to solve the Hamilton-Jacobi-Bellman equations corresponding to the constrained auxiliary subsystems. By establishing an equivalence relationship, we prove the convergence of the SPI algorithm. We implement the SPI algorithm via an actor-critic structure, in which actor networks approximate the optimal control policies and critic networks estimate the optimal value functions. The weight vectors of the actor and critic networks are determined by combining the least squares method with the Monte Carlo integration technique. Finally, we validate the developed control method through the simulation of a nonlinear interconnected plant.
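To make the policy-iteration and least-squares steps concrete, the following Python sketch runs policy iteration on a scalar toy problem. It is not the SPI algorithm of the paper: the plant dx/dt = -x + u, the cost x^2 + u^2, the polynomial critic basis [x^2, x^4], the sampling interval [-1, 1], and the function name `policy_iteration` are all assumptions chosen for illustration, and input constraints, interconnections, and a separate actor network are omitted. The critic weights are fitted by least squares on the Hamilton-Jacobi-Bellman residual at randomly drawn states, which serves as a Monte Carlo approximation of the integral of the squared residual over the sampling interval.

```python
import numpy as np


def policy_iteration(num_iters=10, num_samples=200, seed=0):
    """Policy iteration for the assumed toy plant dx/dt = -x + u.

    Cost rate: x^2 + u^2. Critic: V(x) ~ w[0]*x^2 + w[1]*x^4.
    Returns the critic weight vector after num_iters iterations.
    """
    rng = np.random.default_rng(seed)
    # Random states: the sample average of the squared residual is a
    # Monte Carlo estimate of its integral over [-1, 1].
    xs = rng.uniform(-1.0, 1.0, size=num_samples)
    # Gradient of the basis functions [x^2, x^4] at each sample.
    dphi = np.stack([2.0 * xs, 4.0 * xs**3], axis=1)

    # Zero weights give u = 0, which is admissible here because the
    # open-loop plant dx/dt = -x is already stable.
    w = np.zeros(2)
    for _ in range(num_iters):
        # Policy improvement: u = -(1/2) * dV/dx for unit input gain
        # and unit control weight.
        u = -0.5 * dphi @ w
        f = -xs + u              # closed-loop dynamics at the samples
        r = xs**2 + u**2         # cost rate at the samples
        # Policy evaluation: solve dV/dx * f + r = 0 for w in the
        # least-squares sense over all samples.
        A = dphi * f[:, None]
        w, *_ = np.linalg.lstsq(A, -r, rcond=None)
    return w


weights = policy_iteration()
print(weights)
```

For this toy problem the value function is p x^2, where p solves the scalar Riccati equation p^2 + 2p - 1 = 0, so the first weight should approach sqrt(2) - 1 and the second should stay near zero. This gives a quick sanity check on the sketch.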
