H∞ Control Synthesis for Linear Parabolic PDE Systems with Model-Free Policy Iteration

The H∞ control problem is considered for linear parabolic partial differential equation (PDE) systems with completely unknown system dynamics. We propose a model-free policy iteration (PI) method that learns the H∞ control policy from measured system data, without requiring a system model. First, a finite-dimensional system of ordinary differential equations (ODEs) is derived, which accurately describes the dominant dynamics of the parabolic PDE system. Based on this finite-dimensional ODE model, the H∞ control problem is reformulated as one that is theoretically equivalent to solving an algebraic Riccati equation (ARE). To solve the ARE without system model information, we propose a least-squares-based model-free PI approach that uses real system data. Finally, simulation results demonstrate the effectiveness of the developed model-free PI method.
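To make the role of the ARE concrete, the following is a minimal sketch of policy iteration on an H∞-type ARE. It is not the model-free method of the paper: it is a model-based variant that assumes the reduced ODE model is known, in the hypothetical continuous-time form dx/dt = A x + B1 w + B2 u with state weight Q, control weight R, and attenuation level gamma. All matrices, names, and numerical values below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac^T P + P Ac + M = 0 for P by Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    K = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(K, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def are_residual(P, A, B1, B2, Q, R, gamma):
    """Residual of A^T P + P A + Q - P B2 R^-1 B2^T P + gamma^-2 P B1 B1^T P."""
    ctrl = P @ B2 @ np.linalg.solve(R, B2.T @ P)
    dist = (P @ B1 @ B1.T @ P) / gamma**2
    return A.T @ P + P @ A + Q - ctrl + dist

def hinf_policy_iteration(A, B1, B2, Q, R, gamma, iters=30, tol=1e-12):
    """Model-based policy iteration: update both policies, then evaluate them."""
    n = A.shape[0]
    P = np.zeros((n, n))  # assumes A itself is stable, so P = 0 is admissible
    for _ in range(iters):
        K = np.linalg.solve(R, B2.T @ P)   # control policy u = -K x
        L = (B1.T @ P) / gamma**2          # disturbance policy w = L x
        Ac = A - B2 @ K + B1 @ L           # closed loop under both policies
        M = Q + K.T @ R @ K - gamma**2 * (L.T @ L)
        P_next = solve_lyapunov(Ac, M)     # policy evaluation step
        converged = np.linalg.norm(P_next - P) < tol
        P = P_next
        if converged:
            break
    return P

# Hypothetical two-mode model (stable diagonal A, one actuator per mode).
A = np.diag([-1.0, -4.0])
B1 = np.array([[1.0], [0.5]])
B2 = np.eye(2)
Q = np.eye(2)
R = np.eye(2)
gamma = 2.0

P = hinf_policy_iteration(A, B1, B2, Q, R, gamma)
print("P =\n", P)
print("ARE residual norm:", np.linalg.norm(are_residual(P, A, B1, B2, Q, R, gamma)))
```

In this sketch the policy evaluation step is a Lyapunov equation that uses A, B1, and B2 directly. A data-driven approach of the kind the abstract describes would replace that step with a least-squares fit to measured trajectories; that part is not shown here.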
