Tom Eccles | Laurent Orseau | David Silver | Karol Gregor | Arthur Guez | Greg Wayne | Sébastien Racanière | Mehdi Mirza | Timothy P. Lillicrap | Adam Santoro | David Raposo | Rishabh Kabra | Théophane Weber
[1] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[2] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2017, ICLR.
[4] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[5] Joelle Pineau, et al. A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning, 2018, arXiv.
[6] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[7] Tom Schaul, et al. The Predictron: End-To-End Learning and Planning, 2017, ICML.
[8] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[9] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[10] Jürgen Schmidhuber, et al. Highway and Residual Networks learn Unrolled Iterative Estimation, 2017, ICLR.
[11] J. Schmidhuber. Making the World Differentiable: On Using Self-Supervised Fully Recurrent Neural Networks for Dynamic Reinforcement Learning and Planning in Non-Stationary Environments, 1990.
[12] Samy Bengio, et al. A Study on Overfitting in Deep Reinforcement Learning, 2018, arXiv.
[13] Laurent Orseau, et al. Single-Agent Policy Tree Search With Guarantees, 2018, NeurIPS.
[14] X. Pang, et al. Neural network design for J function approximation in dynamic programming, 1998, adap-org/9806001.
[15] Eric P. Xing, et al. Gated Path Planning Networks, 2018, ICML.
[16] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning, 2018, ICML.
[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[19] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[20] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, arXiv.
[21] Joel Z. Leibo, et al. Prefrontal cortex as a meta-reinforcement learning system, 2018, bioRxiv.
[22] Taehoon Kim, et al. Quantifying Generalization in Reinforcement Learning, 2018, ICML.
[23] Shimon Whiteson, et al. TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning, 2018, ICLR.
[24] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[25] Erik Talvitie. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[26] Jonathan Schaeffer, et al. Using Abstraction for Planning in Sokoban, 2002, Computers and Games.
[27] Satinder Singh, et al. Value Prediction Network, 2017, NIPS.
[28] Rémi Munos, et al. Learning to Search with MCTSnets, 2018, ICML.
[29] Dit-Yan Yeung, et al. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting, 2015, NIPS.
[30] Sergey Levine, et al. Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control, 2018, arXiv.
[31] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2015, ICLR.
[32] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[33] Fabio Viola, et al. Learning and Querying Fast Generative Models for Reinforcement Learning, 2018, arXiv.
[34] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[35] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2017, ICLR.
[36] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[37] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[38] Yoshua Bengio, et al. Residual Connections Encourage Iterative Inference, 2018, ICLR.
[39] Razvan Pascanu, et al. Relational Deep Reinforcement Learning, 2018, arXiv.