[1] Pieter Abbeel,et al. Variational Option Discovery Algorithms , 2018, ArXiv.
[2] Jordi Torres,et al. Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills , 2020, ICML.
[3] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[4] Yishay Mansour,et al. Reinforcement Learning in POMDPs Without Resets , 2005, IJCAI.
[5] Sergey Levine,et al. Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning , 2017, ICLR.
[6] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[7] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[8] Sergey Levine,et al. Learning compound multi-step controllers under unknown dynamics , 2015, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[9] David Warde-Farley,et al. Fast Task Inference with Variational Intrinsic Successor Features , 2019, ICLR.
[10] Yann LeCun,et al. Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic , 2019, ICLR.
[11] Doina Precup,et al. Options of Interest: Temporal Abstraction with Interest Functions , 2020, AAAI.
[12] George Tucker,et al. Conservative Q-Learning for Offline Reinforcement Learning , 2020, NeurIPS.
[13] Pieter Abbeel,et al. Model-Ensemble Trust-Region Policy Optimization , 2018, ICLR.
[14] Sergey Levine,et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems , 2020, ArXiv.
[15] Sham M. Kakade,et al. Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control , 2018, ICLR.
[16] Pieter Abbeel,et al. Adaptive Online Planning for Continual Lifelong Learning , 2019, ArXiv.
[17] Pieter Abbeel,et al. Prediction and Control with Temporal Segment Models , 2017, ICML.
[18] Sergey Levine,et al. Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? , 2019, ArXiv.
[19] Sergey Levine,et al. Deep Online Learning via Meta-Learning: Continual Adaptation for Model-Based RL , 2018, ICLR.
[20] Sergey Levine,et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning , 2018, IEEE International Conference on Robotics and Automation (ICRA).
[21] Sergey Levine,et al. Deep Dynamics Models for Learning Dexterous Manipulation , 2019, CoRL.
[22] Yee Whye Teh,et al. Progress & Compress: A scalable framework for continual learning , 2018, ICML.
[23] Karol Hausman,et al. Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning , 2020, Robotics: Science and Systems.
[24] Martial Hebert,et al. Improving Multi-Step Prediction of Learned Time Series Models , 2015, AAAI.
[25] Jimmy Ba,et al. Exploring Model-based Planning with Policy Networks , 2019, ICLR.
[26] David Warde-Farley,et al. Unsupervised Control Through Non-Parametric Discriminative Rewards , 2018, ICLR.
[27] Evangelos Theodorou,et al. Model Predictive Path Integral Control using Covariance Variable Importance Sampling , 2015, ArXiv.
[28] Lantao Yu,et al. MOPO: Model-based Offline Policy Optimization , 2020, NeurIPS.
[29] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[30] Sergey Levine,et al. Ecological Reinforcement Learning , 2020, ArXiv.
[31] Sergey Levine,et al. Online Meta-Learning , 2019, ICML.
[32] Daan Wierstra,et al. Variational Intrinsic Control , 2016, ICLR.
[33] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[34] Razvan Pascanu,et al. Progressive Neural Networks , 2016, ArXiv.
[35] Yifan Wu,et al. Behavior Regularized Offline Reinforcement Learning , 2019, ArXiv.
[36] Mohammad Norouzi,et al. An Optimistic Perspective on Offline Reinforcement Learning , 2020, ICML.
[37] Arthur Argenson,et al. Model-Based Offline Planning , 2021, ICLR.
[38] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[39] Katja Hofmann,et al. The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors , 2019, ArXiv.
[40] Ruben Villegas,et al. Learning Latent Dynamics for Planning from Pixels , 2018, ICML.
[41] Sergey Levine,et al. Uncertainty-Aware Reinforcement Learning for Collision Avoidance , 2017, ArXiv.
[42] James Kirkpatrick,et al. Overcoming catastrophic forgetting in neural networks , 2017, Proceedings of the National Academy of Sciences.
[43] A. M. Lyapunov. The general problem of the stability of motion , 1992, International Journal of Control.
[44] David Rolnick,et al. Experience Replay for Continual Learning , 2018, NeurIPS.
[45] Christoph Salge,et al. Empowerment - an Introduction , 2013, ArXiv.
[46] Sergey Levine,et al. Dynamics-Aware Unsupervised Discovery of Skills , 2019, ICLR.
[47] David Held,et al. Learning Off-Policy with Online Planning , 2020, CoRL.
[48] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[49] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.
[50] Justin Fu,et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning , 2020, ArXiv.