Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
暂无分享,去创建一个
[1] Sanjeev Arora,et al. Towards Provable Control for Unknown Linear Dynamical Systems , 2018, International Conference on Learning Representations.
[2] Nikolai Matni,et al. Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator , 2018, NeurIPS.
[3] Peter Whittle,et al. Optimal Control: Basics and Beyond , 1996 .
[4] John T. Workman,et al. Optimal Control Applied to Biological Models , 2007 .
[5] David K. Smith,et al. Dynamic Programming and Optimal Control. Volume 1 , 1996 .
[6] Martin J. Wainwright,et al. Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems , 2018, AISTATS.
[7] Yishay Mansour,et al. Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret , 2019, ArXiv.
[8] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[9] Aurea Martínez,et al. A state constrained optimal control problem related to the sterilization of canned foods , 1994, Autom..
[10] Nikolai Matni,et al. On the Sample Complexity of the Linear Quadratic Regulator , 2017, Foundations of Computational Mathematics.
[11] Nevena Lazic,et al. Regret Bounds for Model-Free Linear Quadratic Control , 2018, ArXiv.
[12] Avinatan Hassidim,et al. Online Linear Quadratic Control , 2018, ICML.
[13] Hans P. Geering,et al. Optimal control with engineering applications , 2007 .
[14] H. Robbins,et al. Asymptotically efficient adaptive allocation rules , 1985 .
[15] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..
[16] Yi Ouyang,et al. Learning-based Control of Unknown Linear Systems with Thompson Sampling , 2017, ArXiv.
[17] Michael I. Jordan,et al. Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification , 2018, COLT.
[18] Alessandro Lazaric,et al. Thompson Sampling for Linear-Quadratic Control Problems , 2017, AISTATS.
[19] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[20] F. T. Wright. A Bound on Tail Probabilities for Quadratic Forms in Independent Random Variables Whose Distributions are not Necessarily Symmetric , 1973 .
[21] Kazuoki Azuma. WEIGHTED SUMS OF CERTAIN DEPENDENT RANDOM VARIABLES , 1967 .
[22] Ambuj Tewari,et al. Finite Time Analysis of Optimal Adaptive Policies for Linear-Quadratic Systems , 2017, ArXiv.
[23] Benjamin Recht,et al. Certainty Equivalent Control of LQR is Efficient , 2019, ArXiv.
[24] D. Hearn,et al. OPTIMAL CONTROL MODELS IN FINANCE , 2013 .
[25] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[26] Sham M. Kakade,et al. A tail inequality for quadratic forms of subgaussian random vectors , 2011, ArXiv.
[27] Adel Javanmard,et al. Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems , 2012, NIPS.
[28] Csaba Szepesvári,et al. Regret Bounds for the Adaptive Control of Linear Quadratic Systems , 2011, COLT.
[29] Alessandro Lazaric,et al. Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems , 2018, ICML.
[30] Sham M. Kakade,et al. Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator , 2018, ICML.
[31] F. T. Wright,et al. A Bound on Tail Probabilities for Quadratic Forms in Independent Random Variables , 1971 .