Daniel Guo | Bilal Piot | Mohammad Gheshlaghi Azar | Rémi Munos | Bernardo Avila Pires | Florent Altché | Jean-Bastien Grill
[1] Wojciech Czarnecki, et al. Multi-task Deep Reinforcement Learning with PopArt, 2018, AAAI.
[2] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[3] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
[4] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[5] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[6] Alex Graves, et al. DRAW: A Recurrent Neural Network For Image Generation, 2015, ICML.
[7] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[8] Yuval Tassa, et al. DeepMind Control Suite, 2018, ArXiv.
[9] Joelle Pineau, et al. Combined Reinforcement Learning via Abstract Representations, 2018, AAAI.
[10] Michal Valko, et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, 2020, NeurIPS.
[11] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[12] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[13] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[14] Razvan Pascanu, et al. Learning to Navigate in Complex Environments, 2016, ICLR.
[15] Tamim Asfour, et al. ProMP: Proximal Meta-Policy Search, 2018, ICLR.
[16] Rémi Munos, et al. World Discovery Models, 2019, ArXiv.
[17] Nando de Freitas, et al. Playing hard exploration games by watching YouTube, 2018, NeurIPS.
[18] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[19] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[20] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[21] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[22] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[24] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[25] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[26] Peter Stone, et al. Learning Predictive State Representations, 2003, ICML.
[27] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[28] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[29] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[30] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[31] Lihong Li, et al. Sample Complexity of Multi-task Reinforcement Learning, 2013, UAI.
[32] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[33] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[34] Daniel Guo, et al. Never Give Up: Learning Directed Exploration Strategies, 2020, ICLR.
[35] Marc G. Bellemare, et al. DeepMDP: Learning Continuous Latent Space Models for Representation Learning, 2019, ICML.
[36] Aaron van den Oord, et al. Shaping Belief States with Generative Environment Models for RL, 2019, NeurIPS.
[37] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[38] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[39] Sergey Levine, et al. SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning, 2018, ICML.
[40] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[41] Rémi Munos, et al. Neural Predictive Belief Representations, 2018, ArXiv.
[42] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[43] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[44] Misha Denil, et al. Learning Awareness Models, 2018, ICLR.
[45] H. Francis Song, et al. V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control, 2019, ICLR.
[46] N. Heess, et al. Neural belief states for partially observed domains, 2018.