Heuristic Dynamic Programming Algorithm for Optimal Control Design of Linear Continuous-Time Hyperbolic PDE Systems

This work considers the optimal control problem for linear continuous-time hyperbolic partial differential equation (PDE) systems with partially unknown dynamics. To respect the infinite-dimensional nature of the hyperbolic PDE system, the problem is reduced to solving a space-dependent Riccati differential equation (SDRDE), which requires the full system model. Because the full model is not available, a heuristic dynamic programming (HDP) algorithm is proposed for online optimal control of the hyperbolic PDE system: it collects data along the system trajectories online and learns the solution of the SDRDE without requiring knowledge of the internal system dynamics. Convergence is established by showing that the HDP algorithm generates a nondecreasing sequence that converges uniformly to the solution of the SDRDE. For implementation, the HDP algorithm is realized through an approximation scheme based on the method of weighted residuals. Finally, an application to a steam-jacketed tubular heat exchanger demonstrates the effectiveness of the developed control approach.
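The convergence claim can be illustrated on a much simpler problem. The sketch below is not the algorithm of this work: it is a finite-dimensional, discrete-time, model-based analogue (HDP-style value iteration for a linear-quadratic regulator), in which the Riccati matrix `P` plays the role of the SDRDE solution. The function name `hdp_value_iteration`, the matrices `A`, `B`, `Q`, `R`, and the iteration count are all assumptions chosen for illustration. Starting from `P = 0`, the iterates are nondecreasing in the positive-semidefinite ordering and approach the solution of the discrete-time algebraic Riccati equation.

```python
import numpy as np


def hdp_value_iteration(A, B, Q, R, num_iters=500):
    """HDP-style value iteration for a discrete-time LQR problem.

    Illustrative analogue only: uses the model (A, B) directly, whereas
    the abstract describes a data-driven scheme for a PDE system.
    Returns the final cost matrix P, the feedback gain K (u = -K x),
    and the list of all iterates P_0, P_1, ...
    """
    n = A.shape[0]
    P = np.zeros((n, n))          # zero initial value function
    history = [P]
    for _ in range(num_iters):
        # Greedy (policy improvement) step for the current value estimate.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        A_cl = A - B @ K
        # Value update step: one-step cost plus cost-to-go under P.
        P = Q + K.T @ R @ K + A_cl.T @ P @ A_cl
        P = 0.5 * (P + P.T)       # remove round-off asymmetry
        history.append(P)
    # Gain associated with the final value estimate.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K, history


# Hypothetical example system: a sampled double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K, history = hdp_value_iteration(A, B, Q, R)

# Smallest eigenvalue of each increment P_{i+1} - P_i (should be >= 0
# up to round-off, reflecting the nondecreasing sequence).
min_increment = min(
    np.linalg.eigvalsh(P_next - P_prev).min()
    for P_prev, P_next in zip(history[:-1], history[1:])
)
print("smallest increment eigenvalue:", min_increment)
print("feedback gain K:", K)
```

In this analogue, the monotonicity follows because the i-th iterate is the optimal cost over an i-step horizon with zero terminal cost, so extending the horizon can only add nonnegative stage cost.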
