[1] Marcin Andrychowicz,et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research , 2018, ArXiv.
[2] Yevgen Chebotar,et al. Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[3] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[4] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[5] Yuval Tassa,et al. Value function approximation and model predictive control , 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[6] Nathan Michael,et al. Fast nonlinear model predictive control via partial enumeration , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[7] Roy Fox,et al. Taming the Noise in Reinforcement Learning via Soft Updates , 2015, UAI.
[8] Sergey Levine,et al. One-shot learning of manipulation skills with online dynamics adaptation and neural network priors , 2015, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[9] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[10] Nolan Wagener,et al. Information theoretic MPC for model-based reinforcement learning , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[11] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[12] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[13] Nolan Wagener,et al. An Online Learning Approach to Model Predictive Control , 2019, Robotics: Science and Systems.
[14] Byron Boots,et al. Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning , 2018, ICLR.
[15] Sergey Levine,et al. Learning from the hindsight plan — Episodic MPC improvement , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[16] Emanuel Todorov,et al. Efficient computation of optimal actions , 2009, Proceedings of the National Academy of Sciences.
[17] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[18] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[19] Sham M. Kakade,et al. Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control , 2018, ICLR.
[20] Wojciech Zaremba,et al. Domain randomization for transferring deep neural networks from simulation to the real world , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[21] Model-Ensemble Trust-Region Policy Optimization , 2017.
[22] Yuval Tassa,et al. An integrated system for real-time model predictive control of humanoid robots , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[23] Pieter Abbeel,et al. Model-Ensemble Trust-Region Policy Optimization , 2018, ICLR.
[24] Marcin Andrychowicz,et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[25] Yuval Tassa,et al. Real-time behaviour synthesis for dynamic hand-manipulation , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[26] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[27] Ashish Kapoor,et al. AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles , 2017, FSR.
[28] Evangelos Theodorou,et al. Relative entropy and free energy dualities: Connections to Path Integral and KL control , 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[29] Martial Hebert,et al. Improving Multi-Step Prediction of Learned Time Series Models , 2015, AAAI.
[30] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[31] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[32] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[33] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[34] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[35] David Barber,et al. Thinking Fast and Slow with Deep Learning and Tree Search , 2017, NIPS.
[36] Francesco Borrelli,et al. Learning Model Predictive Control for Iterative Tasks. A Data-Driven Control Framework , 2016, IEEE Transactions on Automatic Control.
[37] Wojciech Jaskowski,et al. Model-Based Active Exploration , 2018, ICML.
[38] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[39] Pieter Abbeel,et al. Equivalence Between Policy Gradients and Soft Q-Learning , 2017, ArXiv.
[40] J. Andrew Bagnell,et al. Agnostic System Identification for Model-Based Reinforcement Learning , 2012, ICML.
[41] Sergey Levine,et al. (CAD)²RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.