On Applications of Bootstrap in Continuous Space Reinforcement Learning
Ambuj Tewari | George Michailidis | Mohamad Kazem Shirani Faradonbeh
[1] Ambuj Tewari,et al. Finite Time Identification in Unstable Linear Systems , 2017, Autom..
[2] Zheng Wen,et al. New Insights into Bootstrapping for Bandits , 2018, ArXiv.
[3] E. Todorov,et al. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems , 2005, Proceedings of the 2005 American Control Conference.
[4] Nikolai Matni,et al. On the Sample Complexity of the Linear Quadratic Regulator , 2017, Foundations of Computational Mathematics.
[5] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[6] Karthikeyan Rajagopal,et al. Neural Network-Based Solutions for Stochastic Optimal Control Using Path Integrals , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[7] T. Lai,et al. Extended least squares and their applications to adaptive control and prediction in linear systems , 1986 .
[8] Nikolai Matni,et al. Safely Learning to Control the Constrained Linear Quadratic Regulator , 2018, 2019 American Control Conference (ACC).
[9] Alessandro Lazaric,et al. LQG for Portfolio Optimization , 2016, arXiv:1611.00997.
[10] Alexander Rakhlin,et al. How fast can linear dynamical systems be learned? , 2018, ArXiv.
[11] E. Mammen. The Bootstrap and Edgeworth Expansion , 1997 .
[12] Tor Lattimore,et al. Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits , 2018, ICML.
[13] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[14] P. Kumar,et al. Adaptive control with the stochastic approximation algorithm: Geometry and convergence , 1985 .
[15] A. Timmermann,et al. Small Sample Properties of Forecasts from Autoregressive Models Under Structural Breaks , 2003, SSRN Electronic Journal.
[16] Ambuj Tewari,et al. On Optimality of Adaptive Linear-Quadratic Regulators , 2018, ArXiv.
[17] T. Lai,et al. Asymptotic properties of general autoregressive models and strong consistency of least-squares estimates of their parameters , 1983 .
[18] Alessandro Lazaric,et al. Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems , 2018, ICML.
[19] Emanuel Todorov,et al. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems , 2004, ICINCO.
[20] Lei Guo,et al. Global Stability/Instability of LS-Based Discrete-Time Adaptive Nonlinear Control , 1996 .
[21] P. Hall,et al. Martingale Limit Theory and its Application. , 1984 .
[22] Jan Willem Polderman,et al. A note on the structure of two subsets of the parameter space in adaptive control problems , 1986 .
[23] Anders Lindquist,et al. On the Nonlinear Dynamics of Fast Filtering Algorithms , 1994 .
[24] Chaouki T. Abdallah,et al. Linear Quadratic Control: An Introduction , 2000 .
[25] Dean Eckles,et al. Thompson sampling with the online bootstrap , 2014, ArXiv.
[26] Lihong Li,et al. Sample Complexity Bounds of Exploration , 2012, Reinforcement Learning.
[27] Ambuj Tewari,et al. Optimism-Based Adaptive Regulation of Linear-Quadratic Systems , 2017, IEEE Transactions on Automatic Control.
[28] T. Lai,et al. Least Squares Estimates in Stochastic Regression Models with Applications to Identification and Control of Dynamic Systems , 1982 .
[29] Mohamad Kazem Shirani Faradonbeh,et al. Finite Time Adaptive Stabilization of LQ Systems , 2018 .
[30] P. Hall,et al. Martingale Limit Theory and Its Application , 1980 .
[31] Martin J. Wainwright,et al. Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems , 2018, AISTATS.
[32] Jan Willem Polderman,et al. On the necessity of identifying the true parameter in adaptive LQ control , 1986 .
[33] Benjamin Van Roy,et al. Bootstrapped Thompson Sampling and Deep Exploration , 2015, ArXiv.
[34] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[35] Ambuj Tewari,et al. Input Perturbations for Adaptive Regulation and Learning , 2018, ArXiv.
[36] Craig Boutilier,et al. Data center cooling using model-predictive control , 2018, NeurIPS.
[37] Benjamin Recht,et al. Simple random search of static linear policies is competitive for reinforcement learning , 2018, NeurIPS.
[38] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[39] Ambuj Tewari,et al. Finite Time Adaptive Stabilization of LQ Systems , 2018, ArXiv.
[40] S. Bittanti,et al. Adaptive Control of Linear Time Invariant Systems: The "Bet on the Best" Principle , 2006 .
[41] H. Kappen. Linear theory for control of nonlinear stochastic systems. , 2004, Physical review letters.
[42] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .
[43] D. McLeish. Dependent Central Limit Theorems and Invariance Principles , 1974 .
[44] B. M. Brown,et al. Martingale Central Limit Theorems , 1971 .
[45] Benjamin Recht,et al. Simple random search provides a competitive approach to reinforcement learning , 2018, ArXiv.
[46] Alexander Rakhlin,et al. Near optimal finite time identification of arbitrary linear dynamical systems , 2018, ICML.
[47] Mohamad Kazem Shirani Faradonbeh,et al. Regret Analysis for Adaptive Linear-Quadratic Policies , 2017 .
[48] Csaba Szepesvári,et al. Regret Bounds for the Adaptive Control of Linear Quadratic Systems , 2011, COLT.
[49] Ambuj Tewari,et al. On adaptive Linear-Quadratic regulators , 2020, Autom..
[50] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[51] P. Kumar,et al. Adaptive Linear Quadratic Gaussian Control: The Cost-Biased Approach Revisited , 1998 .
[52] T. Lai. Asymptotically efficient adaptive control in stochastic regression models , 1986 .
[53] B. Efron. Bootstrap Methods: Another Look at the Jackknife , 1979 .