[1] Han-Fu Chen,et al. Convergence rates in stochastic adaptive tracking , 1989 .
[2] Ambuj Tewari,et al. On adaptive Linear-Quadratic regulators , 2020, Autom..
[3] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[4] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[5] Csaba Szepesvári,et al. Improved Algorithms for Linear Stochastic Bandits , 2011, NIPS.
[6] P. Kumar,et al. Adaptive Linear Quadratic Gaussian Control: The Cost-Biased Approach Revisited , 1998 .
[7] Sean P. Meyn,et al. Distributed Control Design for Balancing the Grid Using Flexible Loads , 2018 .
[8] Ambuj Tewari,et al. On Optimality of Adaptive Linear-Quadratic Regulators , 2018, ArXiv.
[9] Ambuj Tewari,et al. Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs , 2007, NIPS.
[10] T. Lai. Asymptotically efficient adaptive control in stochastic regression models , 1986 .
[11] James Lam,et al. Stabilization of Discrete-Time Nonlinear Uncertain Systems by Feedback Based on LS Algorithm , 2013, SIAM J. Control. Optim..
[12] Mohamad Kazem Shirani Faradonbeh,et al. Regret Analysis for Adaptive Linear-Quadratic Policies , 2017 .
[13] A. Timmermann,et al. Small Sample Properties of Forecasts from Autoregressive Models Under Structural Breaks , 2003, SSRN Electronic Journal.
[14] Adel Javanmard,et al. Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems , 2012, NIPS.
[15] Ambuj Tewari,et al. Optimality of Fast-Matching Algorithms for Random Networks With Applications to Structural Controllability , 2015, IEEE Transactions on Control of Network Systems.
[16] Csaba Szepesvári,et al. Regret Bounds for the Adaptive Control of Linear Quadratic Systems , 2011, COLT.
[17] Lei Guo,et al. Convergence and logarithm laws of self-tuning regulators , 1995, Autom..
[18] Ambuj Tewari,et al. Finite Time Analysis of Optimal Adaptive Policies for Linear-Quadratic Systems , 2017, ArXiv.
[19] Khashayar Khosravi,et al. Exploiting the Natural Exploration In Contextual Bandits , 2017, ArXiv.
[20] Joel A. Tropp,et al. User-Friendly Tail Bounds for Sums of Random Matrices , 2010, Found. Comput. Math..
[21] T. Lai,et al. Least Squares Estimates in Stochastic Regression Models with Applications to Identification and Control of Dynamic Systems , 1982 .
[22] Craig Boutilier,et al. Data center cooling using model-predictive control , 2018, NeurIPS.
[23] Riccardo Marino,et al. Nonlinear control design: geometric, adaptive and robust , 1995 .
[24] S. Liberty,et al. Linear Systems , 2010, Scientific Parallel Computing.
[25] Jan Willem Polderman,et al. A note on the structure of two subsets of the parameter space in adaptive control problems , 1986 .
[26] T. Lai,et al. Extended least squares and their applications to adaptive control and prediction in linear systems , 1986 .
[27] Alessandro Lazaric,et al. Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems , 2018, ICML.
[28] Tamer Basar,et al. Optimal control of LTI systems over unreliable communication links , 2006, Autom..
[29] Sean P. Meyn. Control Techniques for Complex Networks: Workload , 2007 .
[30] Yi Ouyang,et al. Optimal Infinite Horizon Decentralized Networked Controllers With Unreliable Communication , 2018, IEEE Transactions on Automatic Control.
[31] Daphna Weinshall,et al. Online Learning in the Embedded Manifold of Low-rank Matrices , 2012, J. Mach. Learn. Res..
[32] S. Bittanti,et al. Adaptive Control of Linear Time Invariant Systems: The "Bet on the Best" Principle , 2006 .
[33] B. Bercu. Weighted estimation and tracking for ARMAX models , 1992, [1992] Proceedings of the 31st IEEE Conference on Decision and Control.
[34] Ambuj Tewari,et al. Finite Time Identification in Unstable Linear Systems , 2017, Autom..
[35] T. Söderström. Discrete-Time Stochastic Systems: Estimation and Control , 1995 .
[36] Mohamad Kazem Shirani Faradonbeh,et al. Finite Time Adaptive Stabilization of LQ Systems , 2018 .
[37] Ruth F. Curtain,et al. Linear-quadratic control: An introduction , 1997, Autom..
[38] Jan Willem Polderman,et al. On the necessity of identifying the true parameter in adaptive LQ control , 1986 .
[39] Khashayar Khosravi,et al. Mostly Exploration-Free Algorithms for Contextual Bandits , 2017, Manag. Sci..
[40] Sham M. Kakade,et al. Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator , 2018, ICML.
[41] Han-Fu Chen,et al. The Åström-Wittenmark self-tuning regulator revisited and ELS-based adaptive trackers , 1991 .
[42] T. Lai,et al. Parallel recursive algorithms in asymptotically efficient adaptive control of linear stochastic systems , 1991 .
[43] P. Kumar,et al. Convergence of adaptive control schemes using least-squares parameter estimates , 1990 .
[44] Michael I. Jordan,et al. Is Q-learning Provably Efficient? , 2018, NeurIPS.