A Workflow for Offline Model-Free Robotic Reinforcement Learning
Aviral Kumar | Anikait Singh | Stephen Tian | Chelsea Finn | Sergey Levine