Clément Bonnet | Alexandre Laterre | Paul Caron | Ian Davies | Thomas D. Barrett
[1] Timothy M. Hospedales, et al. Online Meta-Critic Learning for Off-Policy Actor-Critic Methods, 2020, NeurIPS.
[2] Satinder Singh, et al. On Learning Intrinsic Rewards for Policy Gradient Methods, 2018, NeurIPS.
[3] Sergey Levine, et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[4] Shimon Whiteson, et al. VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning, 2020, ICLR.
[5] Junhyuk Oh, et al. Discovering Reinforcement Learning Algorithms, 2020, NeurIPS.
[6] Louis Kirsch, et al. Improving Generalization in Meta Reinforcement Learning using Learned Objectives, 2020, ICLR.
[7] Junhyuk Oh, et al. A Self-Tuning Actor-Critic Algorithm, 2020, NeurIPS.
[8] David Silver, et al. Bootstrapped Meta-Learning, 2021, ArXiv.
[9] Richard L. Lewis, et al. Discovery of Useful Questions as Auxiliary Tasks, 2019, NeurIPS.
[10] Katja Hofmann, et al. Fast Context Adaptation via Meta-Learning, 2018, ICML.
[11] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[12] Joshua Achiam, et al. On First-Order Meta-Learning Algorithms, 2018, ArXiv.
[13] Sergey Levine, et al. Meta-Reinforcement Learning of Structured Exploration Strategies, 2018, NeurIPS.
[14] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[15] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[16] Tie-Yan Liu, et al. Beyond Exponentially Discounted Sum: Automatic Learning of Return Function, 2019, ArXiv.
[17] Yevgen Chebotar, et al. Meta Learning via Learned Loss, 2019, 2020 25th International Conference on Pattern Recognition (ICPR).
[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[19] Renjie Liao, et al. Understanding Short-Horizon Bias in Stochastic Meta-Optimization, 2018, ICLR.
[20] Jeremy Nixon, et al. Understanding and correcting pathologies in the training of learned optimizers, 2018, ICML.
[21] David Silver, et al. Meta-Gradient Reinforcement Learning, 2018, NeurIPS.
[22] Pieter Abbeel, et al. Evolved Policy Gradients, 2018, NeurIPS.
[23] Ron Meir, et al. Discount Factor as a Regularizer in Reinforcement Learning, 2020, ICML.
[24] Junhyuk Oh, et al. Meta-Gradient Reinforcement Learning with an Objective Discovered Online, 2020, NeurIPS.