A teacher-student framework to distill future trajectories
Bernhard Schölkopf | Giambattista Parascandolo | Alexander Neitz