[1] John E. R. Staddon, et al. The dynamics of behavior: Review of Sutton and Barto: Reinforcement Learning: An Introduction (2nd ed.), 2020.
[2] Kim-Chuan Toh, et al. Solving semidefinite-quadratic-linear programs using SDPT3, 2003, Math. Program.
[3] Nikolai Matni, et al. On the Sample Complexity of the Linear Quadratic Regulator, 2017, Foundations of Computational Mathematics.
[4] I. Postlethwaite, et al. Linear Matrix Inequalities in Control, 2007.
[5] Andrew Packard, et al. Robust H2 and H∞ filters for uncertain LFT systems, 2005, IEEE Trans. Autom. Control.
[6] Lennart Ljung, et al. Optimal experiment designs with respect to the intended model application, 1986, Autom.
[7] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[8] Pablo A. Parrilo, et al. A convex approach to robust H2 performance analysis, 2002, Autom.
[9] K. Åström, et al. Problems of Identification and Control, 1971.
[10] Zhi-Quan Luo, et al. Multivariate Nonnegative Quadratic Mappings, 2003, SIAM J. Optim.
[11] Sham M. Kakade, et al. Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator, 2018, ICML.
[12] Benjamin Recht, et al. A Tour of Reinforcement Learning: The View from Continuous Control, 2018, Annu. Rev. Control. Robotics Auton. Syst.
[13] Bo Wahlberg, et al. Application-Oriented Input Design in System Identification: Optimal Input Design for Control [Applications of Control], 2017, IEEE Control Systems.
[14] Thomas B. Schön, et al. Learning Robust LQ-Controllers Using Application Oriented Exploration, 2020, IEEE Control Systems Letters.
[15] J. Lofberg, et al. YALMIP: a toolbox for modeling and optimization in MATLAB, 2004, 2004 IEEE International Conference on Robotics and Automation (IEEE Cat. No.04CH37508).
[16] Thomas B. Schön, et al. Robust exploration in linear quadratic reinforcement learning, 2019, NeurIPS.
[17] Avinatan Hassidim, et al. Online Linear Quadratic Control, 2018, ICML.
[18] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[19] Stephen P. Boyd, et al. Policies for simultaneous estimation and optimization, 1999, Proceedings of the 1999 American Control Conference (Cat. No. 99CH36251).