Policy iteration based Q-learning for linear nonzero-sum quadratic differential games
Xinxing Li | Li Liang | Zhihong Peng | Wenzhong Zha
[1] Zhong-Ping Jiang,et al. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics , 2012, Autom.
[2] Derong Liu,et al. Error Bound Analysis of $Q$-Function for Discounted Optimal Control Problems With Policy Iteration , 2017, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[3] Huai-Ning Wu,et al. Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control , 2017, IEEE Transactions on Cybernetics.
[4] Ruey-Wen Liu,et al. Construction of Suboptimal Control Sequences , 1967 .
[5] Frank L. Lewis,et al. Off-Policy Q-Learning: Set-Point Design for Optimizing Dual-Rate Rougher Flotation Operational Processes , 2018, IEEE Transactions on Industrial Electronics.
[6] Frank L. Lewis,et al. $H_\infty$ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[7] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[8] Derong Liu,et al. A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems , 2015, Science China Information Sciences.
[9] Frank L. Lewis,et al. Integral Reinforcement Learning for online computation of feedback Nash strategies of nonzero-sum differential games , 2010, 49th IEEE Conference on Decision and Control (CDC).
[10] Alessandro Astolfi,et al. Constructive $\epsilon$-Nash Equilibria for Nonzero-Sum Differential Games , 2015, IEEE Transactions on Automatic Control.
[11] Paul J. Werbos,et al. Approximate dynamic programming for real-time control and neural modeling , 1992 .
[12] Kyriakos G. Vamvoudakis,et al. Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach , 2017, Syst. Control. Lett.
[13] Dongbing Gu,et al. Construction of Barrier in a Fishing Game With Point Capture , 2017, IEEE Transactions on Cybernetics.
[14] Jacob Engwerda,et al. LQ Dynamic Optimization and Differential Games , 2005 .
[15] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[16] Marcus Johnson,et al. Approximate $N$-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[17] Frank L. Lewis,et al. Off-Policy Integral Reinforcement Learning Method to Solve Nonlinear Continuous-Time Multiplayer Nonzero-Sum Games , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[18] Qichao Zhang,et al. Experience Replay for Optimal Control of Nonzero-Sum Game Systems With Unknown Dynamics , 2016, IEEE Transactions on Cybernetics.
[19] Tingwen Huang,et al. Off-Policy Reinforcement Learning for $H_\infty$ Control Design , 2013, IEEE Transactions on Cybernetics.
[20] Frank L. Lewis,et al. Multi-agent differential graphical games , 2011, Proceedings of the 30th Chinese Control Conference.
[21] Tzyh Jong Tarn,et al. Hybrid MDP based integrated hierarchical Q-learning , 2011, Science China Information Sciences.
[22] Hisham Abou-Kandil,et al. On global existence of solutions to coupled matrix Riccati equations in closed-loop Nash games , 1996, IEEE Trans. Autom. Control.
[23] Huaguang Zhang,et al. Near-Optimal Control for Nonzero-Sum Differential Games of Continuous-Time Nonlinear Systems Using Single-Network ADP , 2013, IEEE Transactions on Cybernetics.
[24] Frank L. Lewis,et al. Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems , 2015, IEEE Transactions on Cybernetics.
[25] Qian Liu,et al. Towards green for relay in InterPlaNetary Internet based on differential game model , 2014, Science China Information Sciences.
[26] Frank L. Lewis,et al. Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations , 2011, Autom.
[27] R. Godson. Elements of intelligence , 1979 .
[28] Chaomin Luo,et al. Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms , 2017, IEEE Transactions on Cybernetics.
[29] Richard B. Vinter,et al. Differential Games Controllers That Confine a System to a Safe Region in the State Space, With Applications to Surge Tank Control , 2012, IEEE Transactions on Automatic Control.
[30] Chaoxu Mu,et al. Developing nonlinear adaptive optimal regulators through an improved neural learning mechanism , 2016, Science China Information Sciences.
[31] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[32] Frank L. Lewis,et al. Adaptive dynamic programming for online solution of a zero-sum differential game , 2011 .
[33] Frank L. Lewis,et al. Neurodynamic Programming and Zero-Sum Games for Constrained Control Systems , 2008, IEEE Transactions on Neural Networks.
[34] Corrado Possieri,et al. An algebraic geometry approach for the computation of all linear feedback Nash equilibria in LQ differential games , 2015, 2015 54th IEEE Conference on Decision and Control (CDC).
[35] Kyriakos G. Vamvoudakis,et al. Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems , 2015, Autom.
[36] Yao Zhao,et al. Cross-Modal Retrieval With CNN Visual Features: A New Baseline , 2017, IEEE Transactions on Cybernetics.
[37] Randal W. Beard. Successive Galerkin approximation algorithms for nonlinear optimal and robust control , 1998 .
[38] Frank L. Lewis,et al. H∞ control of linear discrete-time systems: Off-policy reinforcement learning , 2017, Autom.
[39] Frank L. Lewis,et al. Game Theory-Based Control System Algorithms with Real-Time Reinforcement Learning: How to Solve Multiplayer Games Online , 2017, IEEE Control Systems.
[40] Zongli Lin,et al. Output feedback Q-learning for discrete-time linear zero-sum games with application to the H-infinity control , 2018, Autom.
[41] Frank L. Lewis,et al. Integral Reinforcement Learning for online computation of feedback Nash strategies of nonzero-sum differential games , 2010, CDC 2010.
[42] Tingwen Huang,et al. Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design , 2014, Autom.
[43] Frank L. Lewis,et al. Multi-player non-zero-sum games , 2011 .
[44] D. Kleinman. On an iterative technique for Riccati equation computations , 1968 .
[45] Andrew G. Barto,et al. Adaptive linear quadratic control using policy iteration , 1994, Proceedings of 1994 American Control Conference - ACC '94.
[46] T.-Y. Li,et al. Lyapunov Iterations for Solving Coupled Algebraic Riccati Equations of Nash Differential Games and Algebraic Riccati Equations of Zero-Sum Games , 1995 .
[47] Huaguang Zhang,et al. An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games , 2011, Autom.
[48] Derong Liu,et al. Online Synchronous Approximate Optimal Learning Algorithm for Multi-Player Non-Zero-Sum Games With Unknown Dynamics , 2014, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[49] G. Jank,et al. On Global Existence of Solutions to Coupled Matrix Riccati Equations in Closed Loop , 1996 .
[50] João P. Hespanha,et al. Cooperative Q-Learning for Rejection of Persistent Adversarial Inputs in Networked Linear Quadratic Systems , 2018, IEEE Transactions on Automatic Control.
[51] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[52] Frank L. Lewis,et al. Adaptive optimal control for continuous-time linear systems based on policy iteration , 2009, Autom.
[53] Frank L. Lewis,et al. Discrete-Time Deterministic $Q$-Learning: A Novel Convergence Analysis , 2017, IEEE Transactions on Cybernetics.
[54] Dongbin Zhao,et al. Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data , 2017, IEEE Transactions on Neural Networks and Learning Systems.