Policy iterations for reinforcement learning problems in continuous time and space - Fundamental theory and methods
[1] Yann Ollivier,et al. Making Deep Q-learning methods robust to time discretization , 2019, ICML.
[2] Charles R. Johnson,et al. Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.
[3] Jae Young Lee,et al. Policy Iteration for Discounted Reinforcement Learning Problems in Continuous Time and Space , 2017 .
[4] Derong Liu,et al. Data-based robust optimal control of continuous-time affine nonlinear systems with matched uncertainties , 2016, Inf. Sci..
[5] Frank L. Lewis,et al. Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems Using Off-policy Reinforcement Learning , 2016, IEEE Transactions on Cybernetics.
[6] Frank L. Lewis,et al. H∞ Control of Nonaffine Aerial Systems Using Off-policy Reinforcement Learning , 2016, Unmanned Syst..
[7] Vladimir Gaitsgory,et al. Stabilization with discounted optimal control , 2015, Syst. Control. Lett..
[8] Jae Young Lee,et al. Integral Reinforcement Learning for Continuous-Time Input-Affine Nonlinear Systems With Simultaneous Invariant Explorations , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[9] Frank L. Lewis,et al. Adaptive Suboptimal Output-Feedback Control for Linear Systems Using Integral Reinforcement Learning , 2015, IEEE Transactions on Control Systems Technology.
[10] Zhong-Ping Jiang,et al. Adaptive dynamic programming and optimal control of nonlinear nonaffine systems , 2014, Autom..
[11] Frank L. Lewis,et al. Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning , 2014, IEEE Transactions on Automatic Control.
[12] Jae Young Lee,et al. On integral generalized policy iteration for continuous-time linear quadratic regulations , 2014, Autom..
[13] Wulfram Gerstner,et al. Reinforcement Learning Using a Continuous Time Actor-Critic Framework with Spiking Neurons , 2013, PLoS Comput. Biol..
[14] P. Olver. Nonlinear Systems , 2013 .
[15] Jae Young Lee,et al. Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems , 2012, Autom..
[16] Warren B. Powell. Approximate Dynamic Programming: Solving the Curses of Dimensionality , 2007, Wiley Series in Probability and Statistics.
[17] Sean P. Meyn,et al. Q-learning and Pontryagin's Minimum Principle , 2009, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[18] F.L. Lewis,et al. Reinforcement learning and adaptive dynamic programming for feedback control , 2009, IEEE Circuits and Systems Magazine.
[19] Frank L. Lewis,et al. 2009 Special Issue: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems , 2009 .
[20] Panos M. Pardalos,et al. Approximate dynamic programming: solving the curses of dimensionality , 2009, Optim. Methods Softw..
[21] Frank L. Lewis,et al. Adaptive optimal control for continuous-time linear systems based on policy iteration , 2009, Autom..
[22] W. Haddad,et al. Nonlinear Dynamical Systems and Control: A Lyapunov-Based Approach , 2008 .
[23] Frank L. Lewis,et al. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach , 2005, Autom..
[24] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[25] J. Murray,et al. The Adaptive Dynamic Programming Theorem , 2003 .
[26] George G. Lendaris,et al. Adaptive dynamic programming , 2002, IEEE Trans. Syst. Man Cybern. Part C.
[27] W. A. Kirk,et al. Handbook of metric fixed point theory , 2001 .
[28] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[29] Randal W. Beard,et al. Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation , 1997, Autom..
[30] J. Doyle,et al. Essentials of Robust Control , 1997 .
[31] S. Lyashevskiy. Constrained optimization and control of nonlinear systems: new results in optimal control , 1996, Proceedings of 35th IEEE Conference on Decision and Control.
[32] R. Sundaram. A First Course in Optimization Theory , 1996 .
[33] Chi-Tsong Chen,et al. Linear System Theory and Design , 1995 .
[34] Leiba Rodman,et al. Algebraic Riccati equations , 1995 .
[35] V. Mehrmann. The Autonomous Linear Quadratic Control Problem: Theory and Numerical Solution , 1991 .
[36] A. Bruckner,et al. Elementary Real Analysis , 1991 .
[37] B. Anderson,et al. Optimal control: linear quadratic methods , 1990 .
[38] Gerald B. Folland,et al. Real Analysis: Modern Techniques and Their Applications , 1984 .
[39] W. Arnold,et al. Numerical Solution of Algebraic Matrix Riccati Equations. , 1984 .
[40] George N. Saridis,et al. An Approximation Theory of Optimal Control for Trainable Manipulators , 1979, IEEE Transactions on Systems, Man, and Cybernetics.
[41] D. Kleinman. On an iterative technique for Riccati equation computations , 1968 .
[42] Ruey-Wen Liu,et al. Construction of Suboptimal Control Sequences , 1967 .
[43] Z. Rekasius,et al. Suboptimal design of intentionally nonlinear controllers , 1964 .
[44] W. Rudin. Principles of mathematical analysis , 1964 .
[45] R. Howard. Dynamic Programming and Markov Processes , 1960 .
[46] C. Bessaga. On the converse of the Banach "fixed-point principle" , 1959 .
[47] S. Banach. Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales , 1922 .
[48] T. H. Gronwall. Note on the Derivatives with Respect to a Parameter of the Solutions of a System of Differential Equations , 1919 .
[49] L. Brouwer. Beweis der Invarianz des n-dimensionalen Gebiets , 1911 .