[1] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[2] David Q. Mayne,et al. Tube‐based robust nonlinear model predictive control , 2011.
[3] James M. Rehg,et al. Aggressive driving with model predictive path integral control , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[4] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[5] John Langford,et al. Exploration in Metric State Spaces , 2003, ICML.
[6] Nolan Wagener,et al. An Online Learning Approach to Model Predictive Control , 2019, Robotics: Science and Systems.
[7] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[8] Yuval Tassa,et al. An integrated system for real-time model predictive control of humanoid robots , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[9] Pieter Abbeel,et al. Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.
[10] Dieter Fox,et al. BayesSim: adaptive domain randomization via probabilistic inference for robotics simulators , 2019, Robotics: Science and Systems.
[11] Aravind Rajeswaran,et al. Lyceum: An efficient and scalable ecosystem for robot learning , 2020, L4DC.
[12] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[13] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[14] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[15] Nolan Wagener,et al. Information theoretic MPC for model-based reinforcement learning , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[16] Byron Boots,et al. Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning , 2018, ICLR.
[17] Yuval Tassa,et al. Value function approximation and model predictive control , 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[18] Wojciech Jaskowski,et al. Model-Based Active Exploration , 2018, ICML.
[19] Byron Boots,et al. Information Theoretic Model Predictive Q-Learning , 2020, L4DC.
[20] Sergey Levine,et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[21] Siddhartha S. Srinivasa,et al. Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts , 2020, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[22] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res.
[23] J. Andrew Bagnell,et al. Agnostic System Identification for Model-Based Reinforcement Learning , 2012, ICML.
[24] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[25] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[26] Sham M. Kakade,et al. Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control , 2018, ICLR.
[27] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[28] David Barber,et al. Thinking Fast and Slow with Deep Learning and Tree Search , 2017, NIPS.
[29] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[30] David Q. Mayne,et al. Constrained model predictive control: Stability and optimality , 2000, Autom.
[31] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.