Competitive Experience Replay
[1] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res.
[2] Alexei A. Efros, et al. Large-Scale Study of Curiosity-Driven Learning, 2018, ICLR.
[3] Mo Chen, et al. BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning, 2018, 2019 International Conference on Robotics and Automation (ICRA).
[4] Sergey Levine, et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning, 2018, ICLR.
[5] Hao Liu, et al. Shrinkage-based Bias-Variance Trade-off for Deep Reinforcement Learning, 2018.
[6] Julian Togelius, et al. Pommerman: A Multi-Agent Playground, 2018, AIIDE Workshops.
[7] Joan Bruna, et al. Backplay: "Man muss immer umkehren", 2018, ArXiv.
[8] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[9] Ashley D. Edwards, et al. Forward-Backward Reinforcement Learning, 2018, ArXiv.
[10] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, ArXiv.
[11] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[12] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[13] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[14] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[15] Ilya Kostrikov, et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play, 2017, ICLR.
[16] Sergey Levine, et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection, 2016, Int. J. Robotics Res.
[17] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, ArXiv.
[18] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[19] Pierre-Yves Oudeyer, et al. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning, 2017, J. Mach. Learn. Res.
[20] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[21] Pieter Abbeel, et al. Reverse Curriculum Generation for Reinforcement Learning, 2017, CoRL.
[22] Yuval Tassa, et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, ArXiv.
[23] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[24] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[25] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[26] Yuval Tassa, et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation, 2017, ArXiv.
[27] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[28] Justin Fu, et al. EX2: Exploration with Exemplar Models for Deep Reinforcement Learning, 2017, NIPS.
[29] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[30] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
[31] Sergey Levine, et al. Path integral guided policy search, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[32] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[33] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[34] Shimon Whiteson, et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning, 2016, NIPS.
[35] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[36] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[37] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[38] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[39] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[40] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[41] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[42] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[43] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[44] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[45] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[46] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[47] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[48] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[49] Jürgen Schmidhuber, et al. First Experiments with PowerPlay, 2012, Neural Networks.
[50] Jürgen Schmidhuber, et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem, 2011, Front. Psychol.
[51] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[52] Jason Weston, et al. Curriculum learning, 2009, ICML.
[53] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[54] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[55] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[56] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[57] Roland Olsson, et al. Inductive Functional Programming Using Incremental Program Transformation, 1995, Artif. Intell.
[58] J. Elman, et al. Learning and development in neural networks: the importance of starting small, 1993, Cognition.
[59] Jürgen Schmidhuber, et al. A possibility for implementing curiosity and boredom in model-building neural controllers, 1991.
[60] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.