Model-Free Optimal Tracking Control via Critic-Only Q-Learning
Biao Luo | Derong Liu | Tingwen Huang | Ding Wang
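For context on the topic named in the title, below is a minimal, self-contained sketch of Q-learning for a discounted linear-quadratic tracking problem, in the spirit of the Q-learning tracking designs cited in the reference list (e.g., [24]); it is not the critic-only algorithm of this paper. The plant matrices A and B, the reference dynamics F, the weights, the discount factor, the exploration noise level, and the sample sizes are illustrative assumptions used only to generate data; the learner itself never uses A, B, or F.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Unknown" plant x_{k+1} = A x_k + B u_k and reference r_{k+1} = F r_k
# (illustrative assumptions, used only as a data generator).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
F = np.array([[1.0]])                 # constant set-point reference

n, m, p = 2, 1, 1                     # state, input, reference dimensions
nz = n + p                            # augmented state z = [x; r]
Qw, Rw, gamma = np.eye(n), 0.1 * np.eye(m), 0.9

def cost(z, u):
    e = z[:n] - z[n:]                 # tracking error x - r
    return float(e @ Qw @ e + u @ Rw @ u)

def step(z, u):
    x_next = A @ z[:n] + B @ u
    r_next = F @ z[n:]
    return np.concatenate([x_next, r_next])

def quad_features(z, u):
    # Upper-triangular vectorization of [z; u][z; u]^T with off-diagonals
    # doubled, so that theta @ phi equals [z; u]^T H [z; u].
    v = np.concatenate([z, u])
    M = np.outer(v, v)
    W = 2.0 * M - np.diag(np.diag(M))
    return W[np.triu_indices(len(v))]

K = np.zeros((m, nz))                 # initial admissible policy u = -K z
for it in range(20):                  # Q-learning policy iteration
    # Policy evaluation: least-squares fit of the Q-function kernel from data.
    Phi, y = [], []
    for _ in range(200):
        z = rng.uniform(-2.0, 2.0, nz)
        u = -K @ z + 0.5 * rng.standard_normal(m)   # exploration noise
        z1 = step(z, u)
        u1 = -K @ z1                                 # action of the current policy
        Phi.append(quad_features(z, u) - gamma * quad_features(z1, u1))
        y.append(cost(z, u))
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)

    # Rebuild the symmetric kernel H of Q(z, u) = [z; u]^T H [z; u].
    Hu = np.zeros((nz + m, nz + m))
    Hu[np.triu_indices(nz + m)] = theta
    H = Hu + Hu.T - np.diag(np.diag(Hu))

    # Policy improvement: u = -H_uu^{-1} H_uz z minimizes Q over u.
    K_new = np.linalg.solve(H[nz:, nz:], H[nz:, :nz])
    if np.linalg.norm(K_new - K) < 1e-6:
        break
    K = K_new

print("learned tracking gain K (u = -K [x; r]):")
print(K)
```

Only state, input, cost, and next-state samples enter the regression, which is what makes the design model-free; the quadratic feature vector plays the role of a single critic, with no separate actor network.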
[1] Zhongke Shi,et al. Reinforcement Learning Output Feedback NN Control Using Deterministic Learning Technique , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[2] Yu Jiang,et al. Robust Adaptive Dynamic Programming and Feedback Stabilization of Nonlinear Systems , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[3] Xin Zhang,et al. Data-Driven Robust Approximate Optimal Tracking Control for Unknown General Nonlinear Systems Using Adaptive Dynamic Programming Method , 2011, IEEE Transactions on Neural Networks.
[4] Frank L. Lewis,et al. Optimal Tracking Control of Unknown Discrete-Time Linear Systems Using Input-Output Measured Data , 2015, IEEE Transactions on Cybernetics.
[5] Han-Xiong Li,et al. Data-based Suboptimal Neuro-control Design with Reinforcement Learning for Dissipative Spatially Distributed Processes , 2014 .
[6] Simon Haykin,et al. Neural Networks and Learning Machines , 2010 .
[7] Haibo He,et al. Optimal Control for Unknown Discrete-Time Nonlinear Markov Jump Systems Using Adaptive Dynamic Programming , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[8] Girish Chowdhary,et al. Off-policy reinforcement learning with Gaussian processes , 2014, IEEE/CAA Journal of Automatica Sinica.
[9] Sarangapani Jagannathan,et al. Online Optimal Control of Affine Nonlinear Discrete-Time Systems With Unknown Internal Dynamics by Using Time-Based Policy Update , 2012, IEEE Transactions on Neural Networks and Learning Systems.
[10] Tingwen Huang,et al. Reinforcement learning solution for HJB equation arising in constrained optimal control problem , 2015, Neural Networks.
[11] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[12] Frank L. Lewis,et al. Single Network Adaptive Critics Networks - Development, Analysis, and Applications , 2013 .
[13] Derong Liu,et al. Infinite Horizon Self-Learning Optimal Control of Nonaffine Discrete-Time Nonlinear Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[14] Derong Liu,et al. Numerical adaptive learning control scheme for discrete-time non-linear systems , 2013 .
[15] Chenguang Yang,et al. Global Neural Dynamic Surface Tracking Control of Strict-Feedback Systems With Application to Hypersonic Flight Vehicle , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[16] Keck Voon Ling,et al. Inverse optimal adaptive control for attitude tracking of spacecraft , 2005, IEEE Transactions on Automatic Control.
[17] Derong Liu,et al. Adaptive Dynamic Programming for Optimal Tracking Control of Unknown Nonlinear Systems With Application to Coal Gasification , 2014, IEEE Transactions on Automation Science and Engineering.
[18] Tingwen Huang,et al. Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design , 2014, Autom..
[19] J. Na,et al. Online adaptive approximate optimal tracking control with simplified dual approximation structure for continuous-time unknown nonlinear systems , 2014, IEEE/CAA Journal of Automatica Sinica.
[20] Frank L. Lewis,et al. Model-free H∞ control design for unknown linear discrete-time systems via Q-learning with LMI , 2010, Autom..
[21] Derong Liu,et al. A Novel Dual Iterative Q-Learning Method for Optimal Battery Management in Smart Residential Environments , 2015, IEEE Transactions on Industrial Electronics.
[22] Radhakant Padhi,et al. A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems , 2006, Neural Networks.
[23] Shalabh Bhatnagar,et al. New algorithms of the Q-learning type , 2008, Autom..
[24] Frank L. Lewis,et al. Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics , 2014, Autom..
[25] Pedro Ferreira,et al. An MDP Model-Based Reinforcement Learning Approach for Production Station Ramp-Up Optimization: Q-Learning Analysis , 2014, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[26] Shaocheng Tong,et al. Reinforcement Learning Design-Based Adaptive Tracking Control With Less Learning Parameters for Nonlinear Discrete-Time MIMO Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[27] Frank L. Lewis,et al. Reinforcement Learning and Approximate Dynamic Programming for Feedback Control , 2012 .
[28] Zhongke Shi,et al. An overview on flight dynamics and control approaches for hypersonic vehicles , 2015, Science China Information Sciences.
[29] Huaguang Zhang,et al. Online optimal tracking control of continuous-time linear systems with unknown dynamics by using adaptive dynamic programming , 2014, Int. J. Control.
[30] Qinglai Wei,et al. Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming , 2012, Autom..
[31] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[32] Han-Xiong Li,et al. Adaptive Optimal Control of Highly Dissipative Nonlinear Spatially Distributed Processes With Neuro-Dynamic Programming , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[33] Shalabh Bhatnagar,et al. Toward Off-Policy Learning Control with Function Approximation , 2010, ICML.
[34] Haibo He,et al. Online Learning Control Using Adaptive Critic Designs With Sparse Kernel Machines , 2013, IEEE Transactions on Neural Networks and Learning Systems.
[35] Huaguang Zhang,et al. Optimal Tracking Control for a Class of Nonlinear Discrete-Time Systems With Time Delays Based on Heuristic Dynamic Programming , 2011, IEEE Transactions on Neural Networks.
[36] Zhong-Ping Jiang,et al. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics , 2012, Autom..
[37] Huaguang Zhang,et al. A Novel Infinite-Time Optimal Tracking Control Scheme for a Class of Discrete-Time Nonlinear Systems via the Greedy HDP Iteration Algorithm , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[38] Ali Heydari,et al. Finite-Horizon Control-Constrained Nonlinear Optimal Control Using Single Network Adaptive Critics , 2013, IEEE Transactions on Neural Networks and Learning Systems.
[39] Sivasubramanya N. Balakrishnan,et al. Optimal Tracking Control of Motion Systems , 2012, IEEE Transactions on Control Systems Technology.
[40] Dimitri P. Bertsekas,et al. Temporal Difference Methods for General Projected Equations , 2011, IEEE Transactions on Automatic Control.
[41] Jae Young Lee,et al. Integral Reinforcement Learning for Continuous-Time Input-Affine Nonlinear Systems With Simultaneous Invariant Explorations , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[42] Frank L. Lewis,et al. H∞ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[43] Jinyu Wen,et al. Adaptive Learning in Tracking Control Based on the Dual Critic Network Design , 2013, IEEE Transactions on Neural Networks and Learning Systems.
[44] Derong Liu,et al. Finite-Approximation-Error-Based Discrete-Time Iterative Adaptive Dynamic Programming , 2014, IEEE Transactions on Cybernetics.
[45] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..
[46] Tingwen Huang,et al. Off-Policy Reinforcement Learning for H∞ Control Design , 2013, IEEE Transactions on Cybernetics.
[47] Richard S. Sutton,et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces , 2010, Artificial General Intelligence.
[48] Frank L. Lewis,et al. Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[49] Frank L. Lewis,et al. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning , 2014, Autom..
[50] Haibo He,et al. Model-Free Dual Heuristic Dynamic Programming , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[51] Kurt Hornik,et al. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks , 1990, Neural Networks.
[52] Ali Heydari,et al. Optimal Switching and Control of Nonlinear Switching Systems Using Approximate Dynamic Programming , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[53] Frank L. Lewis,et al. Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems , 2015, IEEE Transactions on Cybernetics.
[54] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[55] Hao Xu,et al. Neural Network-Based Finite-Horizon Optimal Control of Uncertain Affine Nonlinear Discrete-Time Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[56] Jae Young Lee,et al. Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems , 2012, Autom..
[57] R. Sutton,et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces , 2010 .
[58] Warren B. Powell,et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality , 2007 .
[59] F.L. Lewis,et al. Reinforcement learning and adaptive dynamic programming for feedback control , 2009, IEEE Circuits and Systems Magazine.
[60] Tingwen Huang,et al. Data-Driven H∞ Control for Nonlinear Distributed Parameter Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[61] Derong Liu,et al. Data-Driven Neuro-Optimal Temperature Control of Water–Gas Shift Reaction Using Stable Iterative Adaptive Dynamic Programming , 2014, IEEE Transactions on Industrial Electronics.
[62] Frank L. Lewis,et al. Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning , 2014, IEEE Transactions on Automatic Control.
[63] Frank L. Lewis,et al. Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control , 2007, Autom..
[64] Panos M. Pardalos,et al. Approximate dynamic programming: solving the curses of dimensionality , 2009, Optim. Methods Softw..
[65] Frank L. Lewis,et al. Optimal control of nonlinear discrete time-varying systems using a new neural network approximation structure , 2015, Neurocomputing.
[66] Shuzhi Sam Ge,et al. Stable Adaptive Control and Estimation for Nonlinear Systems - Neural and Fuzzy Approximator Techniques (book review), by J. T. Spooner, M. Maggiore, R. Ordóñez, and K. M. Passino, John Wiley & Sons, New York, 2002, ISBN 0-471-41546-4 , 2003, Autom..
[67] Derong Liu,et al. Neural-network-based zero-sum game for discrete-time nonlinear systems via iterative adaptive dynamic programming algorithm , 2013, Neurocomputing.
[68] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[69] Haibo He,et al. Power System Stability Control for a Wind Farm Based on Adaptive Dynamic Programming , 2015, IEEE Transactions on Smart Grid.
[70] Derong Liu,et al. Neural-network-based adaptive optimal tracking control scheme for discrete-time nonlinear systems with approximation errors , 2013, Neurocomputing.
[71] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[72] Frank L. Lewis,et al. Actor–Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[73] Kevin M. Passino,et al. Stable Adaptive Control and Estimation for Nonlinear Systems , 2001 .
[74] Yuanqing Xia,et al. Adaptive attitude tracking control for rigid spacecraft with finite-time convergence , 2013, Autom..
[75] D. Bertsekas. Approximate policy iteration: a survey and some new methods , 2011 .
[76] Warren E. Dixon,et al. Concurrent learning-based approximate feedback-Nash equilibrium solution of N-player nonzero-sum differential games , 2013, IEEE/CAA Journal of Automatica Sinica.
[77] Warren E. Dixon,et al. Lyapunov-Based Exponential Tracking Control of a Hypersonic Aircraft with Aerothermoelastic Effects , 2010 .
[78] Sanjoy Dasgupta,et al. Off-Policy Temporal Difference Learning with Function Approximation , 2001, ICML.
[79] Qinglai Wei,et al. A Novel Iterative θ-Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems , 2014, IEEE Transactions on Automation Science and Engineering.
[80] Zhi-Hong Guan,et al. Optimal tracking and two-channel disturbance rejection under control energy constraint , 2011, Autom..
[81] Warren E. Dixon,et al. Approximate optimal trajectory tracking for continuous-time nonlinear systems , 2013, Autom..
[82] Huai-Ning Wu,et al. Neural Network Based Online Simultaneous Policy Update Algorithm for Solving the HJI Equation in Nonlinear $H_{\infty}$ Control , 2012, IEEE Transactions on Neural Networks and Learning Systems.
[83] Huai-Ning Wu,et al. Approximate Optimal Control Design for Nonlinear One-Dimensional Parabolic PDE Systems Using Empirical Eigenfunctions and Neural Network , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[84] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[85] Haibo He,et al. Adaptive Learning and Control for MIMO System Based on Adaptive Dynamic Programming , 2011, IEEE Transactions on Neural Networks.