Distral: Robust multitask reinforcement learning
Yee Whye Teh | Razvan Pascanu | Raia Hadsell | Wojciech M. Czarnecki | Nicolas Heess | Victor Bapst | James Kirkpatrick | John Quan
[1] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[2] Rich Caruana,et al. Multitask Learning , 1997, Machine-mediated learning.
[3] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[4] Rich Caruana,et al. Model compression , 2006, KDD '06.
[5] Marc Toussaint,et al. Probabilistic inference for solving (PO)MDPs , 2006 .
[6] Peter Stone,et al. An Introduction to Intertask Transfer for Reinforcement Learning , 2011, AI Mag.
[7] Stephen P. Boyd,et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn.
[8] Yoshua Bengio,et al. Deep Learning of Representations for Unsupervised and Transfer Learning , 2011, ICML Unsupervised and Transfer Learning.
[9] Vicenç Gómez,et al. Optimal control as a graphical model inference problem , 2009, Machine Learning.
[10] Sergey Levine,et al. Variational Policy Search via Trajectory Optimization , 2013, NIPS.
[11] N. Roy,et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference , 2013 .
[12] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.
[13] Sergey Levine,et al. Learning Complex Neural Network Policies with Trajectory Optimization , 2014, ICML.
[14] Sergey Levine,et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics , 2014, NIPS.
[15] Razvan Pascanu,et al. Revisiting Natural Gradient for Deep Networks , 2013, ICLR.
[16] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[17] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[18] Yann LeCun,et al. Deep learning with Elastic Averaging SGD , 2014, NIPS.
[19] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[20] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract) , 2012, IJCAI.
[21] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[22] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[23] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[24] Roy Fox,et al. Taming the Noise in Reinforcement Learning via Soft Updates , 2015, UAI.
[25] Roy Fox,et al. Principled Option Learning in Markov Decision Processes , 2016, ArXiv.
[26] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[27] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[28] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[29] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[30] Dale Schuurmans,et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning , 2017, NIPS.
[31] Pieter Abbeel,et al. Equivalence Between Policy Gradients and Soft Q-Learning , 2017, ArXiv.
[32] Razvan Pascanu,et al. Learning to Navigate in Complex Environments , 2016, ICLR.
[33] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[34] Guillaume Lample,et al. Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.