Adaptive Optimal Control of Highly Dissipative Nonlinear Spatially Distributed Processes With Neuro-Dynamic Programming

Highly dissipative nonlinear partial differential equations (PDEs) are widely used to describe the dynamics of industrial spatially distributed processes (SDPs). In this paper, we consider the optimal control problem for general highly dissipative SDPs and propose an adaptive optimal control approach based on neuro-dynamic programming (NDP). First, Karhunen-Loève decomposition is employed to compute empirical eigenfunctions (EEFs) of the SDP using the method of snapshots. These EEFs, together with the singular perturbation technique, are then used to obtain a finite-dimensional slow subsystem of ordinary differential equations that accurately describes the dominant dynamics of the PDE system. Subsequently, the optimal control problem is reformulated on the basis of the slow subsystem, which in turn reduces to solving a Hamilton-Jacobi-Bellman (HJB) equation. The HJB equation is a nonlinear PDE that is generally impossible to solve analytically. Thus, an adaptive optimal control method is developed via NDP that solves the HJB equation online, using a neural network (NN) to approximate the value function, and an online NN weight tuning law is proposed that does not require an initial stabilizing control policy. Moreover, by taking the NN estimation error into account, we prove that the original closed-loop PDE system under the adaptive optimal control policy is semiglobally uniformly ultimately bounded. Finally, the developed method is tested on a nonlinear diffusion-convection-reaction process and applied to a temperature cooling fin of a high-speed aerospace vehicle, and the results show its effectiveness.
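
As an illustration of the first step only, the method of snapshots can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `snapshot_eefs`, the uniform one-dimensional grid, the 99% energy threshold, and the synthetic two-mode snapshot data are all hypothetical choices made for the example.

```python
import numpy as np


def snapshot_eefs(snapshots, dz, energy=0.99):
    """Compute empirical eigenfunctions (EEFs) by the method of snapshots.

    snapshots : array of shape (K, N), K snapshots sampled at N uniformly
                spaced grid points (assumption: 1-D domain, spacing dz).
    energy    : fraction of total snapshot energy to retain (assumed value).
    Returns (eefs, eigvals): eefs has shape (n, N) and is orthonormal in the
    discrete L2 inner product sum(f * g) * dz.
    """
    K = snapshots.shape[0]
    # K x K correlation matrix, C[i, j] = (1/K) * <y_i, y_j>
    C = (snapshots @ snapshots.T) * dz / K
    eigvals, eigvecs = np.linalg.eigh(C)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    # Keep the smallest number of modes that captures the requested energy
    frac = np.cumsum(eigvals) / np.sum(eigvals)
    n = min(int(np.searchsorted(frac, energy)) + 1, K)
    # Each EEF is a linear combination of the snapshots
    eefs = eigvecs[:, :n].T @ snapshots
    # Normalize so that sum(phi**2) * dz == 1
    norms = np.sqrt(np.sum(eefs ** 2, axis=1) * dz)
    return eefs / norms[:, None], eigvals[:n]


# Synthetic data (assumption): snapshots built from two spatial modes
z = np.linspace(0.0, np.pi, 101)
dz = z[1] - z[0]
t = np.linspace(0.0, 1.0, 20, endpoint=False)
snaps = (np.cos(2 * np.pi * t)[:, None] * np.sin(z)[None, :]
         + 0.5 * np.sin(2 * np.pi * t)[:, None] * np.sin(2 * z)[None, :])

eefs, lam = snapshot_eefs(snaps, dz)
print(eefs.shape)
```

The eigenproblem here is of size K (number of snapshots) rather than N (number of grid points), which is the usual reason for preferring the method of snapshots when the spatial grid is fine. Projecting the PDE onto the retained EEFs to obtain the slow subsystem is a separate step that is not shown.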
