Off-policy based adaptive dynamic programming method for nonzero-sum games on discrete-time system
Huaguang Zhang | Yinlei Wen | He Ren | Kun Zhang
[1] Huaguang Zhang,et al. General value iteration based reinforcement learning for solving optimal tracking control problem of continuous-time affine nonlinear systems, 2017, Neurocomputing.
[2] Yanhong Luo,et al. Data-driven optimal tracking control for a class of affine non-linear continuous-time systems with completely unknown dynamics , 2016 .
[3] Randal W. Beard,et al. Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation , 1997, Autom..
[4] Hamid Reza Karimi,et al. A Robust Observer-Based Sensor Fault-Tolerant Control for PMSM in Electric Vehicles , 2016, IEEE Transactions on Industrial Electronics.
[5] Huaguang Zhang,et al. Online optimal control of unknown discrete-time nonlinear systems by using time-based adaptive dynamic programming , 2015, Neurocomputing.
[6] Derong Liu,et al. Observer based adaptive dynamic programming for fault tolerant control of a class of nonlinear systems , 2017, Inf. Sci..
[7] Tingwen Huang,et al. Reinforcement learning solution for HJB equation arising in constrained optimal control problem , 2015, Neural Networks.
[8] Huaguang Zhang,et al. Event-Triggered-Based Distributed Cooperative Energy Management for Multienergy Systems , 2019, IEEE Transactions on Industrial Informatics.
[9] Huaguang Zhang,et al. A distributed Newton–Raphson-based coordination algorithm for multi-agent optimization with discrete-time communication , 2018, Neural Computing and Applications.
[10] Richard S. Sutton,et al. A Menu of Designs for Reinforcement Learning Over Time , 1995 .
[11] Rong Su,et al. Polynomial approach to optimal one-wafer cyclic scheduling of treelike hybrid multi-cluster tools via Petri nets , 2018, IEEE/CAA Journal of Automatica Sinica.
[12] Zhong-Ping Jiang,et al. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics , 2012, Autom..
[13] Rongrong Wang,et al. Actuator and sensor faults estimation based on proportional integral observer for TS fuzzy model , 2017, J. Frankl. Inst..
[14] Frank L. Lewis,et al. Optimal Control , 1986 .
[15] Huaguang Zhang,et al. Near-Optimal Control for Nonzero-Sum Differential Games of Continuous-Time Nonlinear Systems Using Single-Network ADP , 2013, IEEE Transactions on Cybernetics.
[16] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[17] Haibo He,et al. GrDHP: A General Utility Function Representation for Dual Heuristic Dynamic Programming , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[18] Frank L. Lewis,et al. H∞ control of linear discrete-time systems: Off-policy reinforcement learning , 2017, Autom..
[19] Derong Liu,et al. Integral Reinforcement Learning for Linear Continuous-Time Zero-Sum Games With Completely Unknown Dynamics , 2014, IEEE Transactions on Automation Science and Engineering.
[20] Derong Liu,et al. Neuro-optimal control for a class of unknown nonlinear dynamic systems using SN-DHP technique , 2013, Neurocomputing.
[21] Hamid Reza Karimi,et al. A mixed 0-1 linear programming approach to the computation of all pure-strategy Nash equilibria of a finite n-person game in normal form , 2014 .
[22] David G. Hull,et al. Optimal Control Theory for Applications , 2003 .
[23] Chaomin Luo,et al. Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms , 2017, IEEE Transactions on Cybernetics.
[24] Huaguang Zhang,et al. General value iteration based single network approach for constrained optimal controller design of partially-unknown continuous-time nonlinear systems , 2018, J. Frankl. Inst..
[25] Yu Liu,et al. Optimal constrained self-learning battery sequential management in microgrid via adaptive dynamic programming , 2017, IEEE/CAA Journal of Automatica Sinica.
[26] Derong Liu,et al. Online Synchronous Approximate Optimal Learning Algorithm for Multi-Player Non-Zero-Sum Games With Unknown Dynamics , 2014, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[27] Frank L. Lewis,et al. Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[28] Derong Liu,et al. Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[29] Qichao Zhang,et al. Experience Replay for Optimal Control of Nonzero-Sum Game Systems With Unknown Dynamics , 2016, IEEE Transactions on Cybernetics.
[30] Frank L. Lewis,et al. Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[31] Huaguang Zhang,et al. An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games , 2011, Autom..
[32] Frank L. Lewis,et al. Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations , 2011, Autom..
[33] Frank L. Lewis,et al. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach , 2005, Autom..
[34] Haibo He,et al. An Event-Triggered ADP Control Approach for Continuous-Time System With Unknown Internal States , 2017, IEEE Transactions on Cybernetics.
[35] Behzad Moshiri,et al. Haar Wavelet-Based Approach for Optimal Control of Second-Order Linear Systems in Time Domain , 2005 .
[36] Huaguang Zhang,et al. Adaptive Dynamic Programming: An Introduction , 2009, IEEE Computational Intelligence Magazine.
[37] Derong Liu,et al. Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm , 2013, Neurocomputing.
[38] Huai‐Ning Wu,et al. Computationally efficient simultaneous policy update algorithm for nonlinear H∞ state feedback control with Galerkin's method , 2013 .
[39] Tingwen Huang,et al. Off-Policy Reinforcement Learning for $H_\infty$ Control Design , 2013, IEEE Transactions on Cybernetics.
[40] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .
[41] Kun Zhang,et al. Iterative adaptive dynamic programming methods with neural network implementation for multi-player zero-sum games , 2018, Neurocomputing.
[42] Derong Liu,et al. Decentralized guaranteed cost control of interconnected systems with uncertainties: A learning-based optimal control strategy , 2016, Neurocomputing.
[43] Tingwen Huang,et al. Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design , 2014, Autom..
[44] Hamid Reza Karimi,et al. A computational method for solving optimal control and parameter estimation of linear systems using Haar wavelets , 2004, Int. J. Comput. Math..
[45] George G. Lendaris,et al. Adaptive dynamic programming , 2002, IEEE Trans. Syst. Man Cybern. Part C.
[46] Kun Zhang,et al. Value iteration based integral reinforcement learning approach for H∞ controller design of continuous-time nonlinear systems , 2018, Neurocomputing.
[47] Tingwen Huang,et al. Data-Driven $H_\infty$ Control for Nonlinear Distributed Parameter Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.