Optimism-driven exploration for nonlinear systems
Sergey Levine | Michael I. Jordan | Pieter Abbeel | Teodor Mihai Moldovan
[1] L. Lasdon, et al. Nonlinear Optimization by Successive Linear Programming, 1982.
[2] Manfred Morari, et al. Model predictive control: Theory and practice - A survey, 1989, Autom.
[3] Karl Johan Åström, et al. Adaptive Control, 1989, Embedded Digital Control with Microcontrollers.
[4] R. Bodson, et al. Multivariable adaptive algorithms for reconfigurable flight control, 1994, Proceedings of 1994 33rd IEEE Conference on Decision and Control.
[5] Jay H. Lee, et al. Model predictive control: past, present and future, 1999.
[6] Eduardo F. Camacho, et al. Introduction to Model Based Predictive Control, 1999.
[7] Anil V. Rao, et al. Direct Trajectory Optimization and Costate Estimation via an Orthogonal Collocation Method, 2006.
[8] Michael I. Jordan, et al. Variational inference for Dirichlet process mixtures, 2006.
[9] Ambuj Tewari, et al. Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs, 2007, NIPS.
[10] Pieter Abbeel, et al. Autonomous Autorotation of an RC Helicopter, 2008, ISER.
[11] John T. Betts, et al. Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, 2009.
[12] Csaba Szepesvári, et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, 2009, Theor. Comput. Sci.
[13] Shimon Whiteson, et al. Neuroevolutionary reinforcement learning for generalized helicopter control, 2009, GECCO.
[14] Aude Billard, et al. BM: An iterative algorithm to learn stable non-linear dynamical systems with Gaussian mixture models, 2010, 2010 IEEE International Conference on Robotics and Automation.
[15] Pieter Abbeel, et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning, 2010, Int. J. Robotics Res.
[16] Tilo Strutz, et al. Data Fitting and Uncertainty: A practical introduction to weighted least squares and beyond, 2010.
[17] Yee Whye Teh, et al. Dirichlet Process, 2017, Encyclopedia of Machine Learning and Data Mining.
[18] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[19] Csaba Szepesvári, et al. Regret Bounds for the Adaptive Control of Linear Quadratic Systems, 2011, COLT.
[20] Ian R. Manchester, et al. Stable dynamic walking over uneven terrain, 2011, Int. J. Robotics Res.
[21] Olivier Buffet, et al. Near-Optimal BRL using Optimistic Local Transitions, 2012, ICML.
[22] Yiannis Demiris, et al. A nonparametric Bayesian approach toward robot learning by demonstration, 2012, Robotics Auton. Syst.
[23] Claire J. Tomlin, et al. Extensions of learning-based model predictive control for real-time application to a quadrotor helicopter, 2012, 2012 American Control Conference (ACC).
[24] Scott Kuindersma, et al. Variational Bayesian Optimization for Runtime Risk-Sensitive Control, 2012, Robotics: Science and Systems.
[25] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[26] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Found. Trends Robotics.
[27] Andre Wibisono, et al. Streaming Variational Bayes, 2013, NIPS.
[28] Yuval Tassa, et al. An integrated system for real-time model predictive control of humanoid robots, 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[29] Francesco Borrelli, et al. Solving linear and quadratic programs with an analog circuit, 2014, Comput. Chem. Eng.
[30] Carl E. Rasmussen, et al. Gaussian Processes for Data-Efficient Learning in Robotics and Control, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.