Fabio Viola | Demis Hassabis | Daan Wierstra | Karol Gregor | Lars Buesing | Sébastien Racanière | Theophane Weber | Frederic Besse | S. M. Ali Eslami | Danilo Jimenez Rezende | David P. Reichert
[1] Ole Winther, et al. Sequential Neural Models with Stochastic Layers, 2016, NIPS.
[2] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[3] Uri Shalit, et al. Deep Kalman Filters, 2015, ArXiv.
[4] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[5] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[6] Daan Wierstra, et al. Recurrent Environment Simulators, 2017, ICLR.
[7] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[8] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SIGART Bull.
[9] J. Betts. Survey of Numerical Methods for Trajectory Optimization, 1998.
[10] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[11] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[12] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[13] Sergey Levine, et al. Deep visual foresight for planning robot motion, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[14] Thomas B. Schön, et al. From Pixels to Torques: Policy Learning with Deep Dynamical Models, 2015, ICML.
[15] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[16] Richard E. Turner, et al. Neural Adaptive Sequential Monte Carlo, 2015, NIPS.
[17] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[18] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[19] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[20] Erik Talvitie, et al. Agnostic System Identification for Monte Carlo Planning, 2015, AAAI.
[21] Il Memming Park, et al. Black Box Variational Inference for State Space Models, 2015, ArXiv (1511.07367).
[22] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.
[23] Sergey Levine, et al. Stochastic Variational Video Prediction, 2017, ICLR.
[24] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[25] Jitendra Malik, et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics, 2016, NIPS.
[26] Yann LeCun, et al. Model-Based Planning in Discrete Action Spaces, 2017, ArXiv.
[27] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[28] Yann LeCun, et al. Model-Based Planning with Discrete and Continuous Actions, 2017.
[29] Martin A. Riedmiller, et al. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images, 2015, NIPS.
[30] Tom Schaul, et al. The Predictron: End-To-End Learning and Planning, 2016, ICML.
[31] Yoshua Bengio, et al. A Recurrent Latent Variable Model for Sequential Data, 2015, NIPS.
[32] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[33] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[34] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[35] Romain Laroche, et al. Hybrid Reward Architecture for Reinforcement Learning, 2017, NIPS.