暂无分享,去创建一个
Yarin Gal | Angelos Filos | Owain Evans | Zachary Kenton | Angelos Filos | Y. Gal | Zachary Kenton | Owain Evans
[1] Shimon Whiteson,et al. Fingerprint Policy Optimisation for Robust Reinforcement Learning , 2018, ICML.
[2] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[3] Jianfeng Gao,et al. End-to-End Task-Completion Neural Dialogue Systems , 2017, IJCNLP.
[4] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[5] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[6] Charles Blundell,et al. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.
[7] Sergey Levine,et al. Uncertainty-Aware Reinforcement Learning for Collision Avoidance , 2017, ArXiv.
[8] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[9] C A Nelson,et al. Learning to Learn , 2017, Encyclopedia of Machine Learning and Data Mining.
[10] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[11] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[13] Zoubin Ghahramani,et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.
[14] Laurent Orseau,et al. AI Safety Gridworlds , 2017, ArXiv.
[15] Owain Evans,et al. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention , 2017, AAMAS.
[16] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[17] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[18] Sepp Hochreiter,et al. Learning to Learn Using Gradient Descent , 2001, ICANN.
[19] Samy Bengio,et al. A Study on Overfitting in Deep Reinforcement Learning , 2018, ArXiv.
[20] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..
[21] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res..
[22] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[23] Marta Z. Kwiatkowska,et al. Evaluating Uncertainty Quantification in End-to-End Autonomous Driving Control , 2018, ArXiv.
[24] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[25] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[26] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[27] Yoshua Bengio,et al. On the Optimization of a Synaptic Learning Rule , 2007 .
[28] Thomas G. Dietterich. Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.
[29] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[30] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[31] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[32] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[33] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[34] Marlos C. Machado,et al. Generalization and Regularization in DQN , 2018, ArXiv.
[35] Jianfeng Gao,et al. Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear , 2016, ArXiv.