暂无分享,去创建一个
[1] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[2] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[3] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[4] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[5] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[6] DarrellTrevor,et al. End-to-end training of deep visuomotor policies , 2016 .
[7] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[8] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[9] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[11] Sergey Levine,et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning , 2018, ICLR.
[12] Weinan Zhang,et al. MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence , 2017, AAAI.
[13] Tobias Ley,et al. Enhanced Experience Replay Generation for Efficient Reinforcement Learning , 2017, ArXiv.
[14] Peter Stone,et al. Learning Curriculum Policies for Reinforcement Learning , 2018, AAMAS.
[15] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[16] Yuval Tassa,et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation , 2017, ArXiv.
[17] M. Rosenblatt. Remarks on Some Nonparametric Estimates of a Density Function , 1956 .
[18] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[19] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[20] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[21] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[22] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[23] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[24] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[25] Alexei A. Efros,et al. Large-Scale Study of Curiosity-Driven Learning , 2018, ICLR.
[26] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[27] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[28] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[29] Joan Bruna,et al. Backplay: "Man muss immer umkehren" , 2018, ArXiv.
[30] Pieter Abbeel,et al. Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.
[31] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[32] Marc G. Bellemare,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[33] Jason Weston,et al. Curriculum learning , 2009, ICML '09.