Tom Schaul | David Silver | Matteo Hessel | Gabriel Dulac-Arnold | Arthur Guez | Thomas Degris | Tim Harley | Hado van Hasselt | André Barreto | David P. Reichert | Neil C. Rabinowitz
[1] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[2] Alex Graves,et al. Adaptive Computation Time for Recurrent Neural Networks , 2016, ArXiv.
[3] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[4] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[5] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[6] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[7] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[8] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[10] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[11] Jürgen Schmidhuber,et al. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models , 2015, ArXiv.
[12] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[13] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[14] Tom Schaul,et al. Better Generalization with Forecasts , 2013, IJCAI.
[15] Richard S. Sutton,et al. Multi-timescale nexting in a reinforcement learning robot , 2011, Adapt. Behav..
[16] Xinyun Chen. Delving into Transferable Adversarial Examples and Black-box Attacks , 2016 .
[17] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[18] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[20] Honglak Lee,et al. Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.
[21] Daan Wierstra,et al. Recurrent Environment Simulators , 2017, ICLR.
[22] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[23] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[24] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[25] Zhuowen Tu,et al. Deeply-Supervised Nets , 2014, AISTATS.
[26] Pieter Abbeel,et al. Value Iteration Networks , 2016, NIPS.
[27] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.