OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation
Jure Leskovec | Yuke Zhu | Anima Anandkumar | Animesh Garg | Hongyu Ren
[1] Shakir Mohamed, et al. Implicit Reparameterization Gradients, 2018, NeurIPS.
[2] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[3] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[4] Pieter Abbeel, et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning, 2016, ICLR.
[5] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[6] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[7] Sebastian Thrun, et al. Learning to Learn: Introduction and Overview, 1998, Learning to Learn.
[8] Siyuan Li, et al. Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards, 2019, NeurIPS.
[9] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[10] Qiang Liu, et al. Learning to Explore via Meta-Policy Gradient, 2018, ICML.
[11] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[12] Yee Whye Teh, et al. Meta reinforcement learning as task inference, 2019, ArXiv.
[13] Sergey Levine, et al. Latent Space Policies for Hierarchical Reinforcement Learning, 2018, ICML.
[14] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.
[15] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[17] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[18] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[19] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, ArXiv.
[20] Yoshua Bengio, et al. Learning a synaptic learning rule, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[21] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[22] Zeb Kurth-Nelson, et al. Learning to reinforcement learn, 2016, CogSci.
[23] Alexander J. Smola, et al. Meta-Q-Learning, 2020, ICLR.
[24] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[25] Qiang Liu, et al. Learning to Explore with Meta-Policy Gradient, 2018, ICML.
[26] Pieter Abbeel, et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning, 2018, ICLR.
[27] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[28] Sergey Levine, et al. Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, 2019, ICML.
[29] Tamim Asfour, et al. ProMP: Proximal Meta-Policy Search, 2018, ICLR.