Linear Quadratic Reinforcement Learning: Sublinear Regret in the Episodic Continuous-Time Framework
Xin Guo | Anran Hu | Matteo Basei