Hindsight Experience Replay
Marcin Andrychowicz | Wojciech Zaremba | Pieter Abbeel | Peter Welinder | Josh Tobin | Bob McGrew | Dwight Crow | Rachel Fong | Alex Ray | Jonas Schneider
[1] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[2] J. Elman. Learning and development in neural networks: the importance of starting small, 1993, Cognition.
[3] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[4] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[5] Jürgen Schmidhuber, et al. Optimal Ordered Problem Solver, 2002, Machine Learning.
[6] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[7] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[8] Peter Dayan, et al. Structure in the Space of Value Functions, 2002, Machine Learning.
[9] Michael L. Littman, et al. A theoretical analysis of Model-Based Interval Estimation, 2005, ICML.
[10] Bram Bakker, et al. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization, 2003.
[11] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks.
[12] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time, 2009, ICML.
[13] Jason Weston, et al. Curriculum learning, 2009, ICML.
[14] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[15] Jan Peters, et al. Reinforcement Learning to Adjust Parametrized Motor Primitives to New Situations, 2011.
[16] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[17] Bruno Castro da Silva, et al. Learning Parameterized Skills, 2012, ICML.
[18] Jürgen Schmidhuber, et al. First Experiments with PowerPlay, 2012, Neural Networks.
[19] Jürgen Schmidhuber, et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem, 2011, Frontiers in Psychology.
[20] Wojciech Zaremba, et al. Learning to Execute, 2014, arXiv.
[21] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[22] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[23] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[24] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[25] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[26] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[27] Sergio Gomez Colmenarejo, et al. Hybrid computing using a neural network with dynamic external memory, 2016, Nature.
[28] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[29] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, arXiv.
[30] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, Journal of Machine Learning Research.
[31] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[32] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[33] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[34] Navdeep Jaitly, et al. Discrete Sequential Prediction of Continuous Actions for Deep RL, 2017, arXiv.
[35] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[36] Marc G. Bellemare, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[37] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[38] Sergey Levine, et al. Path integral guided policy search, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[39] Wojciech Zaremba, et al. Domain randomization for transferring deep neural networks from simulation to the real world, 2017, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[40] Abhinav Gupta, et al. Learning to push by grasping: Using multiple tasks for effective learning, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[41] Sergey Levine, et al. Learning modular neural network policies for multi-task and multi-robot transfer, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[42] Yuval Tassa, et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation, 2017, arXiv.
[43] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[44] Ilya Kostrikov, et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play, 2017, ICLR.
[45] J. Schmidhuber, et al. Learning to Generate Focus Trajectories for Attentive Vision, 2019.