Sergey Levine | Chelsea Finn | Aurick Zhou | Kate Rakelly | Deirdre Quillen
[1] Yoshua Bengio,et al. Learning a synaptic learning rule , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[2] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell.
[3] Sebastian Thrun,et al. Learning to Learn , 1998, Springer US.
[4] J. Tenenbaum. A Bayesian framework for concept learning , 1999.
[5] Malcolm J. A. Strens,et al. A Bayesian Framework for Reinforcement Learning , 2000, ICML.
[6] Pietro Perona,et al. A Bayesian approach to unsupervised one-shot learning of object categories , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[7] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[8] Benjamin Van Roy,et al. (More) Efficient Reinforcement Learning via Posterior Sampling , 2013, NIPS.
[9] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.
[10] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[11] David Silver,et al. Memory-based control with recurrent neural networks , 2015, ArXiv.
[12] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[13] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[14] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[15] Finale Doshi-Velez,et al. Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations , 2013, IJCAI.
[16] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[17] Sergey Bartunov,et al. Meta-Learning with Memory-Augmented Neural Networks , 2016.
[18] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[19] Hugo Larochelle,et al. Optimization as a Model for Few-Shot Learning , 2016, ICLR.
[20] Marcin Andrychowicz,et al. One-Shot Imitation Learning , 2017, NIPS.
[21] Alexander A. Alemi,et al. Deep Variational Information Bottleneck , 2017, ICLR.
[22] Li Zhang,et al. Learning to Learn: Meta-Critic Networks for Sample Efficient Learning , 2017, ArXiv.
[23] Richard S. Zemel,et al. Prototypical Networks for Few-shot Learning , 2017, NIPS.
[24] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[25] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[26] Katja Hofmann,et al. Meta Reinforcement Learning with Latent Variable Gaussian Processes , 2018, UAI.
[27] David Silver,et al. Meta-Gradient Reinforcement Learning , 2018, NeurIPS.
[28] Sergey Levine,et al. Probabilistic Model-Agnostic Meta-Learning , 2018, NeurIPS.
[29] Qiang Liu,et al. Learning to Explore via Meta-Policy Gradient , 2018, ICML.
[30] Thomas L. Griffiths,et al. Recasting Gradient-Based Meta-Learning as Hierarchical Bayes , 2018, ICLR.
[31] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[32] Sergey Levine,et al. Meta-Reinforcement Learning of Structured Exploration Strategies , 2018, NeurIPS.
[33] Andrew J. Davison,et al. Task-Embedded Control Networks for Few-Shot Imitation Learning , 2018, CoRL.
[34] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[35] Qiang Liu,et al. Learning to Explore with Meta-Policy Gradient , 2018, ICML.
[36] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[37] Yoshua Bengio,et al. Bayesian Model-Agnostic Meta-Learning , 2018, NeurIPS.
[38] Shimon Whiteson,et al. Deep Variational Reinforcement Learning for POMDPs , 2018, ICML.
[39] Pieter Abbeel,et al. Some Considerations on Learning to Explore via Meta-Reinforcement Learning , 2018, ICLR.
[40] Pieter Abbeel,et al. Evolved Policy Gradients , 2018, NeurIPS.
[41] Alexandre Lacoste,et al. TADAM: Task dependent adaptive metric for improved few-shot learning , 2018, NeurIPS.
[42] Razvan Pascanu,et al. Meta-Learning with Latent Embedding Optimization , 2018, ICLR.
[43] Sebastian Nowozin,et al. Meta-Learning Probabilistic Inference for Prediction , 2018, ICLR.
[44] Sergey Levine,et al. Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning , 2018, ICLR.
[45] Tamim Asfour,et al. ProMP: Proximal Meta-Policy Search , 2018, ICLR.