Discrete-Time Deterministic $Q$-Learning: A Novel Convergence Analysis
Frank L. Lewis | Qiuye Sun | Qinglai Wei | Ruizhuo Song | Pengfei Yan
[1] Kyriakos G. Vamvoudakis,et al. Asymptotically Stable Adaptive–Optimal Control Algorithm With Saturating Actuators and Relaxed Persistence of Excitation , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[2] Frank L. Lewis,et al. Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI , 2010, Autom..
[3] Huaguang Zhang,et al. Adaptive Dynamic Programming for a Class of Complex-Valued Nonlinear Systems , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[4] Dimitri P. Bertsekas,et al. Value and Policy Iterations in Optimal Control and Adaptive Dynamic Programming , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[5] Huaguang Zhang,et al. Near-Optimal Control for Nonzero-Sum Differential Games of Continuous-Time Nonlinear Systems Using Single-Network ADP , 2013, IEEE Transactions on Cybernetics.
[6] Habib Rajabi Mashhadi,et al. An Adaptive $Q$-Learning Algorithm Developed for Agent-Based Computational Modeling of Electricity Market , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[7] Zhong-Ping Jiang,et al. Global Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems , 2013, IEEE Transactions on Automatic Control.
[8] Jay H. Lee,et al. Approximate dynamic programming-based approaches for input-output data-driven control of nonlinear processes , 2005, Autom..
[9] Shalabh Bhatnagar,et al. Reinforcement Learning With Function Approximation for Traffic Signal Control , 2011, IEEE Transactions on Intelligent Transportation Systems.
[10] Xiangnan Zhong,et al. An Event-Triggered ADP Control Approach for Continuous-Time System With Unknown Internal States , 2017 .
[11] Bart De Schutter,et al. Reinforcement Learning and Dynamic Programming Using Function Approximators , 2010 .
[12] Derong Liu,et al. Data-Driven Neuro-Optimal Temperature Control of Water–Gas Shift Reaction Using Stable Iterative Adaptive Dynamic Programming , 2014, IEEE Transactions on Industrial Electronics.
[13] Frank L. Lewis,et al. Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning , 2014, IEEE Transactions on Automatic Control.
[14] Frank L. Lewis,et al. Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control , 2007, Autom..
[15] Bo Lincoln,et al. Relaxing dynamic programming , 2006, IEEE Transactions on Automatic Control.
[16] Haibo He,et al. A Novel Energy Function-Based Stability Evaluation and Nonlinear Control Approach for Energy Internet , 2017, IEEE Transactions on Smart Grid.
[17] Jianwei Zhang,et al. A Survey on CPG-Inspired Control Models and System Implementation , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[18] Haibo He,et al. Goal Representation Heuristic Dynamic Programming on Maze Navigation , 2013, IEEE Transactions on Neural Networks and Learning Systems.
[19] Derong Liu,et al. Finite-Approximation-Error-Based Discrete-Time Iterative Adaptive Dynamic Programming , 2014, IEEE Transactions on Cybernetics.
[20] Ali Heydari,et al. Revisiting Approximate Dynamic Programming and its Convergence , 2014, IEEE Transactions on Cybernetics.
[21] Frank L. Lewis,et al. Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems , 2014, Autom..
[22] Derong Liu,et al. Adaptive Dynamic Programming for Optimal Tracking Control of Unknown Nonlinear Systems With Application to Coal Gasification , 2014, IEEE Transactions on Automation Science and Engineering.
[23] Li Ren,et al. A Multiagent Q-Learning-Based Optimal Allocation Approach for Urban Water Resource Management System , 2014, IEEE Transactions on Automation Science and Engineering.
[24] Shaocheng Tong,et al. A Unified Approach to Adaptive Neural Control for Nonlinear Discrete-Time Systems With Nonlinear Dead-Zone Input , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[25] Derong Liu,et al. A self-learning scheme for residential energy system control and management , 2013, Neural Computing and Applications.
[26] Frank L. Lewis,et al. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning , 2014, Autom..
[27] Frank L. Lewis,et al. Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems With Disturbances , 2016, IEEE Transactions on Cybernetics.
[28] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[29] Sarangapani Jagannathan,et al. Online Optimal Control of Affine Nonlinear Discrete-Time Systems With Unknown Internal Dynamics by Using Time-Based Policy Update , 2012, IEEE Transactions on Neural Networks and Learning Systems.
[30] Richard S. Sutton,et al. A Menu of Designs for Reinforcement Learning Over Time , 1995 .
[31] Haibo He,et al. GrDHP: A General Utility Function Representation for Dual Heuristic Dynamic Programming , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[32] Derong Liu,et al. Infinite Horizon Self-Learning Optimal Control of Nonaffine Discrete-Time Nonlinear Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[33] Huaguang Zhang,et al. Distributed Cooperative Optimal Control for Multiagent Systems on Directed Graphs: An Inverse Optimal Approach , 2015, IEEE Transactions on Cybernetics.
[34] F. Lewis,et al. Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers , 2012, IEEE Control Systems.
[35] Amit Konar,et al. A Deterministic Improved Q-Learning for Path Planning of a Mobile Robot , 2013, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[36] Frank L. Lewis,et al. Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[37] Derong Liu,et al. Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems , 2016, IEEE Transactions on Cybernetics.
[38] Hao Xu,et al. Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses , 2012, Autom..
[39] Shaocheng Tong,et al. Reinforcement Learning Design-Based Adaptive Tracking Control With Less Learning Parameters for Nonlinear Discrete-Time MIMO Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[40] Josep M. Guerrero,et al. Hybrid Three-Phase/Single-Phase Microgrid Architecture With Power Management Capabilities , 2015, IEEE Transactions on Power Electronics.
[41] Qinglai Wei,et al. Data-Driven Zero-Sum Neuro-Optimal Control for a Class of Continuous-Time Unknown Nonlinear Systems With Disturbance Using ADP , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[42] Derong Liu,et al. A Novel Dual Iterative $Q$-Learning Method for Optimal Battery Management in Smart Residential Environments , 2015, IEEE Transactions on Industrial Electronics.
[43] Qinglai Wei,et al. A Novel Iterative $\theta $-Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems , 2014, IEEE Transactions on Automation Science and Engineering.
[44] Derong Liu,et al. Model-Free Adaptive Dynamic Programming for Optimal Control of Discrete-Time Affine Nonlinear System , 2014 .
[45] Huaguang Zhang,et al. A Novel Infinite-Time Optimal Tracking Control Scheme for a Class of Discrete-Time Nonlinear Systems via the Greedy HDP Iteration Algorithm , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[46] Paul J. Werbos. A menu of designs for reinforcement learning over time , 1990 .
[47] Shaocheng Tong,et al. Adaptive NN Tracking Control of Uncertain Nonlinear Discrete-Time Systems With Nonaffine Dead-Zone Input , 2015, IEEE Transactions on Cybernetics.
[48] Hao Xu,et al. Finite-horizon near optimal adaptive control of uncertain linear discrete-time systems , 2015 .
[49] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[50] Derong Liu,et al. Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[51] Frank L. Lewis,et al. Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks , 2013, IEEE Transactions on Neural Networks and Learning Systems.
[52] Derong Liu,et al. Multibattery Optimal Coordination Control for Home Energy Management Systems via Distributed Iterative Adaptive Dynamic Programming , 2015, IEEE Transactions on Industrial Electronics.
[53] Josep M. Guerrero,et al. A Multiagent-Based Consensus Algorithm for Distributed Coordinated Control of Distributed Generators in the Energy Internet , 2015, IEEE Transactions on Smart Grid.
[54] H. Vincent Poor,et al. QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations , 2012, IEEE Trans. Signal Process..
[55] Frank L. Lewis,et al. Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations , 2011, Autom..
[56] Frank L. Lewis,et al. Stochastic Optimal Design for Unknown Linear Discrete‐Time System Zero‐Sum Games in Input‐Output form Under Communication Constraints , 2014 .
[57] Zhong-Ping Jiang,et al. Adaptive dynamic programming and optimal control of nonlinear nonaffine systems , 2014, Autom..
[58] Frank L. Lewis,et al. Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data , 2015, IEEE Transactions on Neural Networks and Learning Systems.