Richard Socher | Hao Liu | Caiming Xiong | Alexander Trott
[1] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res.
[2] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[3] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[4] Yuval Tassa,et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation , 2017, ArXiv.
[5] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[6] Hao Liu,et al. Shrinkage-based Bias-Variance Trade-off for Deep Reinforcement Learning , 2018.
[7] Roland Olsson,et al. Inductive Functional Programming Using Incremental Program Transformation , 1995, Artif. Intell.
[8] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[9] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[10] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[11] Nando de Freitas,et al. Sample Efficient Actor-Critic with Experience Replay , 2016, ICLR.
[12] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev.
[13] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[14] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[15] Joan Bruna,et al. Backplay: "Man muss immer umkehren" , 2018, ArXiv.
[16] Sergey Levine,et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res.
[17] Justin Fu,et al. EX2: Exploration with Exemplar Models for Deep Reinforcement Learning , 2017, NIPS.
[18] Pieter Abbeel,et al. Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.
[19] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[20] Alex Graves,et al. Automated Curriculum Learning for Neural Networks , 2017, ICML.
[21] Marcin Andrychowicz,et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research , 2018, ArXiv.
[22] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[23] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[24] J. Elman. Learning and development in neural networks: the importance of starting small , 1993, Cognition.
[25] Stewart W. Wilson,et al. A Possibility for Implementing Curiosity and Boredom in Model-Building Neural Controllers , 1991.
[26] Sergey Levine,et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models , 2015, ArXiv.
[27] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[28] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[29] Mo Chen,et al. BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[30] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[31] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[32] Sergey Levine,et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.
[33] Ashley D. Edwards,et al. Forward-Backward Reinforcement Learning , 2018, ArXiv.
[34] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[35] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[36] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[37] Sergey Levine,et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning , 2018, ICLR.
[38] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[39] Pierre-Yves Oudeyer,et al. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning , 2017, J. Mach. Learn. Res.
[40] Jürgen Schmidhuber,et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol.
[41] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[42] Julian Togelius,et al. Pommerman: A Multi-Agent Playground , 2018, AIIDE Workshops.
[43] Sergey Levine,et al. Path integral guided policy search , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[44] Longxin Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching , 2004, Machine Learning.
[45] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[46] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[47] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[48] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[49] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[50] Alexei A. Efros,et al. Large-Scale Study of Curiosity-Driven Learning , 2018, ICLR.
[51] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[52] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[53] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[54] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[55] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[56] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[57] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[58] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[59] Jürgen Schmidhuber,et al. First Experiments with PowerPlay , 2012, Neural networks : the official journal of the International Neural Network Society.
[60] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.