Offline Meta-Reinforcement Learning with Advantage Weighting
Sergey Levine | Chelsea Finn | Eric Mitchell | Xue Bin Peng | Rafael Rafailov
[1] Sergey Levine,et al. Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning , 2019, ArXiv.
[2] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[3] Aviv Tamar,et al. Offline Meta Reinforcement Learning , 2020, ArXiv.
[4] Yee Whye Teh,et al. Meta reinforcement learning as task inference , 2019, ArXiv.
[5] Renjie Liao,et al. Understanding Short-Horizon Bias in Stochastic Meta-Optimization , 2018, ICLR.
[6] Fei Sha,et al. When MAML Can Adapt Fast and How to Assist When It Cannot , 2021, AISTATS.
[7] Mohammad Norouzi,et al. An Optimistic Perspective on Offline Reinforcement Learning , 2020, ICML.
[8] Yevgen Chebotar,et al. Meta Learning via Learned Loss , 2019, ICPR 2020.
[9] Sergey Levine,et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables , 2019, ICML.
[10] Artem Molchanov,et al. Generalized Inner Loop Meta-Learning , 2019, ArXiv.
[11] Yoshua Bengio,et al. Torchmeta: A Meta-Learning library for PyTorch , 2019, ArXiv.
[12] Pieter Abbeel,et al. Meta-Learning with Temporal Convolutions , 2017, ArXiv.
[13] Sergey Levine,et al. Meta-Learning with Implicit Gradients , 2019, NeurIPS.
[14] Katja Hofmann,et al. Fast Context Adaptation via Meta-Learning , 2018, ICML.
[15] Stefan Schaal,et al. Reinforcement learning by reward-weighted regression for operational space control , 2007, ICML.
[16] Sergey Levine,et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction , 2019, NeurIPS.
[17] S. Levine,et al. Guided Meta-Policy Search , 2019, NeurIPS.
[18] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[19] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[21] S. Levine,et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems , 2020, ArXiv.
[22] Chelsea Finn,et al. Learning to Learn with Gradients , 2018.
[23] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[24] Sergey Levine,et al. Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning , 2018, ICLR.
[25] Sebastian Thrun,et al. Learning to Learn: Introduction and Overview , 1998, Learning to Learn.
[26] Louis Kirsch,et al. Improving Generalization in Meta Reinforcement Learning using Learned Objectives , 2020, ICLR.
[27] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[28] Sergey Levine,et al. One-Shot Visual Imitation Learning via Meta-Learning , 2017, CoRL.
[29] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[30] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[31] Atil Iscen,et al. NoRML: No-Reward Meta Learning , 2019, AAMAS.
[32] Natasha Jaques,et al. Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog , 2019, ArXiv.
[33] Pieter Abbeel,et al. Evolved Policy Gradients , 2018, NeurIPS.
[34] S. Levine,et al. Accelerating Online Reinforcement Learning with Offline Datasets , 2020, ArXiv.
[35] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[36] Louis Kirsch,et al. Improving Generalization in Meta Reinforcement Learning using Neural Objectives , 2020, ICLR.
[37] Yifan Wu,et al. Behavior Regularized Offline Reinforcement Learning , 2019, ArXiv.
[38] Yoshua Bengio,et al. On the Optimization of a Synaptic Learning Rule , 2007.
[39] Ricardo Luna Gutierrez,et al. Information-theoretic Task Selection for Meta-Reinforcement Learning , 2020, NeurIPS.
[40] Aviv Tamar,et al. Offline Meta Learning of Exploration , 2020.
[41] Sergey Levine,et al. Meta-Reinforcement Learning of Structured Exploration Strategies , 2018, NeurIPS.
[42] Tamim Asfour,et al. ProMP: Proximal Meta-Policy Search , 2018, ICLR.
[43] Sergey Levine,et al. Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm , 2017, ICLR.
[44] Katja Hofmann,et al. Meta Reinforcement Learning with Latent Variable Gaussian Processes , 2018, UAI.
[45] Sergey Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[46] Joshua B. Tenenbaum,et al. Human-level concept learning through probabilistic program induction , 2015, Science.
[47] Alexander J. Smola,et al. Meta-Q-Learning , 2020, ICLR.
[48] Razvan Pascanu,et al. Meta-Learning with Latent Embedding Optimization , 2018, ICLR.
[49] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[50] Shimon Whiteson,et al. VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning , 2020, ICLR.