Four nonlinear multi-input multi-output ADHDP constructions and algorithms based on topology principle

This paper presents four action-dependent heuristic dynamic programming (ADHDP) control methods, based on the topology principle, for nonlinear multi-input multi-output (MIMO) systems with different characteristics. The four methods are the action-network extension method, the sub-network method, the cascaded action-network method, and the combined method. The derivation procedure and computing formulas for each method are also given. The action-network extension method is mainly suited to cases where the output variables have the same order of magnitude and are naturally coupled. The sub-network method applies in nearly all cases and handles output variables with different orders of magnitude. The cascaded action-network method is used when the input variables have an explicit cascaded relationship. The combined method can be used to control some more complex systems. Together, these four methods can meet almost all design requirements of nonlinear MIMO control systems. Designers can select among these methods and their formulas, according to the results presented here, to achieve better control performance.
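
For orientation, the following is a minimal sketch of a generic ADHDP structure with a vector-valued action, written with NumPy. It is not one of the four constructions from the paper, whose formulas the abstract does not give. It assumes the commonly used formulation in which the critic takes the state and action together and is trained on the temporal-difference error e_c = gamma * J(t) - (J(t-1) - U(t)). All names (`init_adhdp`, `action`, `critic`, `critic_step`), the network sizes, the tanh activations, and the learning rate are illustrative assumptions.

```python
import numpy as np

def init_adhdp(n_state, n_action, n_hidden=8, seed=0):
    """Random weights for a one-hidden-layer action network and critic (assumed sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "Wa1": rng.uniform(-0.5, 0.5, (n_hidden, n_state)),
        "Wa2": rng.uniform(-0.5, 0.5, (n_action, n_hidden)),
        # The critic is action-dependent: its input is the state and action stacked.
        "Wc1": rng.uniform(-0.5, 0.5, (n_hidden, n_state + n_action)),
        "Wc2": rng.uniform(-0.5, 0.5, n_hidden),
    }

def action(p, x):
    """Action network: state vector -> bounded action vector (one output per control)."""
    return np.tanh(p["Wa2"] @ np.tanh(p["Wa1"] @ x))

def critic(p, x, u):
    """Critic network: (state, action) -> scalar cost-to-go estimate J."""
    z = np.concatenate([x, u])
    h = np.tanh(p["Wc1"] @ z)
    return p["Wc2"] @ h, h, z

def critic_error(p, x, u, J_prev, U, gamma=0.95):
    """Temporal-difference error e_c = gamma*J(t) - (J(t-1) - U(t))."""
    J, _, _ = critic(p, x, u)
    return gamma * J - (J_prev - U)

def critic_step(p, x, u, J_prev, U, gamma=0.95, lr=0.01):
    """One gradient-descent step on 0.5*e_c**2; returns the error before the update."""
    J, h, z = critic(p, x, u)
    e = gamma * J - (J_prev - U)
    dJ_dpre = p["Wc2"] * (1.0 - h ** 2)          # dJ/d(hidden pre-activation)
    grad_Wc2 = gamma * e * h                      # dE/dWc2
    grad_Wc1 = gamma * e * np.outer(dJ_dpre, z)   # dE/dWc1
    p["Wc2"] = p["Wc2"] - lr * grad_Wc2
    p["Wc1"] = p["Wc1"] - lr * grad_Wc1
    return e

# Usage: two states, two control inputs, one critic update on a made-up sample.
params = init_adhdp(n_state=2, n_action=2)
x = np.array([0.5, -0.3])
u = action(params, x)
e_before = critic_step(params, x, u, J_prev=0.0, U=1.0)
e_after = critic_error(params, x, u, J_prev=0.0, U=1.0)
```

The action-network update (backpropagating the critic output through the action input) is omitted to keep the sketch short. How the paper's four methods arrange or split these networks would have to be taken from its full text.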
