Danijar Hafner | Timothy Lillicrap | Mohammad Norouzi | Jimmy Ba
[1] Ruben Villegas, et al. Learning Latent Dynamics for Planning from Pixels, 2018, ICML.
[2] Sergey Levine, et al. SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning, 2018, ICML.
[3] Sergey Levine, et al. Stochastic Variational Video Prediction, 2017, ICLR.
[4] Daniel Guo, et al. Agent57: Outperforming the Atari Human Benchmark, 2020, ICML.
[5] Shane Legg, et al. Noisy Networks for Exploration, 2017, ICLR.
[6] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[7] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[8] Pieter Abbeel, et al. Benchmarking Model-Based Reinforcement Learning, 2019, arXiv.
[9] Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning, 2018, ICLR.
[10] Kaiming He, et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2020, CVPR.
[11] Yann LeCun, et al. Model-Based Planning with Discrete and Continuous Actions, 2017.
[12] Rob Fergus, et al. Stochastic Video Generation with a Learned Prior, 2018, ICML.
[13] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2017, J. Artif. Intell. Res.
[14] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[15] Gregory Dudek, et al. Synthesizing Neural Network Controllers with Probabilistic Model-Based Reinforcement Learning, 2018, IROS.
[16] Sepp Hochreiter, et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), 2015, ICLR.
[17] David Amos, et al. Generative Temporal Models with Memory, 2017, arXiv.
[18] Elman Mansimov, et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, 2017, NIPS.
[19] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[20] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[21] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[22] Duy Nguyen-Tuong, et al. Probabilistic Recurrent State-Space Models, 2018, ICML.
[23] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[24] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[25] Marc G. Bellemare, et al. Dopamine: A Research Framework for Deep Reinforcement Learning, 2018, arXiv.
[26] Marc G. Bellemare, et al. The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning, 2017, ICLR.
[27] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[28] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, CVPRW.
[29] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[30] Uri Shalit, et al. Deep Kalman Filters, 2015, arXiv.
[31] Pieter Abbeel, et al. CURL: Contrastive Unsupervised Representations for Reinforcement Learning, 2020, ICML.
[32] Maximilian Karl, et al. Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data, 2016, ICLR.
[33] Joel Z. Leibo, et al. Unsupervised Predictive Memory in a Goal-Directed Agent, 2018, arXiv.
[34] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[35] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[36] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, arXiv.
[37] Sergey Levine, et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning, 2018, ICRA.
[38] Fabien Moutarde, et al. Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field, 2019.
[39] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.
[40] Shimon Whiteson, et al. Deep Variational Reinforcement Learning for POMDPs, 2018, ICML.
[41] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[42] Sergey Levine, et al. Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model, 2019, NeurIPS.
[43] Rémi Coulom, et al. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, 2006, Computers and Games.
[44] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2016, ICLR.
[45] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[46] Rémi Munos, et al. Implicit Quantile Networks for Distributional Reinforcement Learning, 2018, ICML.
[47] Pieter Abbeel, et al. Model-Ensemble Trust-Region Policy Optimization, 2018, ICLR.
[48] Daan Wierstra, et al. Recurrent Environment Simulators, 2017, ICLR.
[49] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[50] Ilya Kostrikov, et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, 2020, arXiv.
[51] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[52] Sebastian Risi, et al. Deep neuroevolution of recurrent and discrete world models, 2019, GECCO.
[53] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[54] C. Rasmussen, et al. Improving PILCO with Bayesian Neural Network Dynamics Models, 2016.
[55] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[56] Lantao Yu, et al. MOPO: Model-based Offline Policy Optimization, 2020, NeurIPS.
[57] Martin A. Riedmiller, et al. Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models, 2019, CoRL.
[58] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[59] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[60] Allan Jabri, et al. Universal Planning Networks, 2018, ICML.
[61] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[62] Marc G. Bellemare, et al. Distributional Reinforcement Learning with Quantile Regression, 2017, AAAI.
[63] Marlos C. Machado, et al. On Bonus Based Exploration Methods In The Arcade Learning Environment, 2020, ICLR.
[64] Jürgen Schmidhuber, et al. World Models, 2018, arXiv.
[65] Jimmy Ba, et al. Exploring Model-based Planning with Policy Networks, 2019, ICLR.
[66] Gabriel Kalweit, et al. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning, 2017, CoRL.
[67] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[68] Sergey Levine, et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models, 2018, NeurIPS.
[69] Mohammad Norouzi, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[70] Karol Gregor, et al. Temporal Difference Variational Auto-Encoder, 2018, ICLR.
[71] Sergey Levine, et al. Self-Supervised Visual Planning with Temporal Skip Connections, 2017, CoRL.
[72] Pieter Abbeel, et al. Planning to Explore via Self-Supervised World Models, 2020, ICML.
[73] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, arXiv.
[74] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.
[75] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[76] Carl E. Rasmussen, et al. PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos, 2019, ICML.
[77] Michal Valko, et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, 2020, NeurIPS.
[78] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[79] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, ACM SIGART Bulletin.
[80] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.