暂无分享,去创建一个
[1] Yishay Mansour,et al. Online Markov Decision Processes , 2009, Math. Oper. Res..
[2] Yishay Mansour,et al. Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret , 2019, ICML.
[3] Max Simchowitz,et al. Learning Linear Dynamical Systems with Semi-Parametric Least Squares , 2019, COLT.
[4] Ambuj Tewari,et al. Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret , 2012, ICML.
[5] Benjamin Recht,et al. Certainty Equivalence is Efficient for Linear Quadratic Control , 2019, NeurIPS.
[6] Kunal Talwar,et al. Online learning over a finite action set with limited switching , 2018, COLT.
[7] Elad Hazan,et al. Logarithmic regret algorithms for online convex optimization , 2006, Machine Learning.
[8] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[9] Robert F. Stengel,et al. Optimal Control and Estimation , 1994 .
[10] Avinatan Hassidim,et al. Online Linear Quadratic Control , 2018, ICML.
[11] Karan Singh,et al. Logarithmic Regret for Online Control , 2019, NeurIPS.
[12] Na Li,et al. Online Optimal Control with Linear Dynamics and Predictions: Algorithms and Regret Analysis , 2019, NeurIPS.
[13] Karthik Sridharan,et al. Online Non-Parametric Regression , 2014, COLT.
[14] Nikolai Matni,et al. Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator , 2018, NeurIPS.
[15] Dante C. Youla,et al. Modern Wiener-Hopf Design of Optimal Controllers. Part I , 1976 .
[16] Sham M. Kakade,et al. The Nonstochastic Control Problem , 2020, ALT.
[17] Babak Hassibi,et al. Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems , 2020, NeurIPS.
[18] Sham M. Kakade,et al. Online Control with Adversarial Disturbances , 2019, ICML.
[19] Csaba Szepesvári,et al. Regret Bounds for the Adaptive Control of Linear Quadratic Systems , 2011, COLT.
[20] Amin Karbasi,et al. Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition , 2020, NeurIPS.
[21] Babak Hassibi,et al. Regret Minimization in Partially Observable Linear Quadratic Control , 2020, ArXiv.
[22] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[23] Shie Mannor,et al. Online Learning for Adversaries with Memory: Price of Past Mistakes , 2015, NIPS.
[24] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[25] Max Simchowitz,et al. Logarithmic Regret for Adversarial Online Control , 2020, ICML.
[26] Varun Kanade,et al. Tracking Adversarial Targets , 2014, ICML.
[27] Elad Hazan,et al. Introduction to Online Convex Optimization , 2016, Found. Trends Optim..
[28] Max Simchowitz,et al. Naive Exploration is Optimal for Online LQR , 2020, ICML.
[29] Max Simchowitz,et al. Improper Learning for Non-Stochastic Control , 2020, COLT.
[30] Y. Halevi. Stable LQG controllers , 1994, IEEE Trans. Autom. Control..
[31] Alon Cohen,et al. Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently , 2020, ICML.
[32] Yurii Nesterov,et al. First-order methods of smooth convex optimization with inexact oracle , 2013, Mathematical Programming.
[33] Yuval Peres,et al. Bandits with switching costs: T2/3 regret , 2013, STOC.