Learning by Playing - Solving Sparse Reward Tasks from Scratch
Martin A. Riedmiller | Roland Hafner | Thomas Lampe | Michael Neunert | Jonas Degrave | Tom Van de Wiele | Volodymyr Mnih | Nicolas Heess | Jost Tobias Springenberg
[1] Peter Dayan, et al. Improving Generalization for Temporal Difference Learning: The Successor Representation, 1993, Neural Computation.
[2] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[3] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[4] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[5] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[6] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[7] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[8] Andrea Bonarini, et al. Transfer of samples in batch reinforcement learning, 2008, ICML.
[9] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[10] Sriraam Natarajan, et al. Transfer in variable-reward hierarchical reinforcement learning, 2008, Machine Learning.
[11] Jason Weston, et al. Curriculum learning, 2009, ICML.
[12] Richard L. Lewis, et al. Where Do Rewards Come From?, 2009.
[13] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2011, Machine Learning.
[14] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[15] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[16] Jan Peters, et al. Hierarchical Relative Entropy Policy Search, 2014, AISTATS.
[17] Jürgen Schmidhuber, et al. Learning skills from play: Artificial curiosity on a Katana robot arm, 2012, IJCNN.
[18] Tom Schaul, et al. Better Generalization with Forecasts, 2013, IJCAI.
[19] Jürgen Schmidhuber, et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem, 2011, Frontiers in Psychology.
[20] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[21] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[22] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[23] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[24] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[25] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[26] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[27] Samuel Gershman, et al. Deep Successor Reinforcement Learning, 2016, arXiv.
[28] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[29] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[30] Sergey Levine, et al. Guided Policy Search via Approximate Mirror Descent, 2016, NIPS.
[31] Sepp Hochreiter, et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), 2015, ICLR.
[32] Pierre-Yves Oudeyer, et al. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning, 2017, JMLR.
[33] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[34] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, arXiv.
[35] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[36] Vladlen Koltun, et al. Learning to Act by Predicting the Future, 2016, ICLR.
[37] Razvan Pascanu, et al. Sim-to-Real Robot Learning from Pixels with Progressive Nets, 2016, CoRL.
[38] Marcin Andrychowicz, et al. One-Shot Imitation Learning, 2017, NIPS.
[39] Wojciech Zaremba, et al. Domain randomization for transferring deep neural networks from simulation to the real world, 2017, IROS.
[40] Sergey Levine, et al. Unsupervised Perceptual Rewards for Imitation Learning, 2016, Robotics: Science and Systems.
[41] Yuval Tassa, et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, arXiv.
[42] Peter Stone, et al. Autonomous Task Sequencing for Customized Curriculum Design in Reinforcement Learning, 2017, IJCAI.
[43] Razvan Pascanu, et al. Learning to Navigate in Complex Environments, 2016, ICLR.
[44] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[45] Tom Schaul, et al. Successor Features for Transfer in Reinforcement Learning, 2016, NIPS.
[46] Guillaume Lample, et al. Playing FPS Games with Deep Reinforcement Learning, 2016, AAAI.
[47] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2017, ICRA.
[48] Misha Denil, et al. The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously, 2017, CoRL.
[49] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[50] Sergey Levine, et al. Divide-and-Conquer Reinforcement Learning, 2017, ICLR.
[51] Sergey Levine, et al. Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control, 2017, arXiv.
[52] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2018, ICRA.
[53] Murray Shanahan, et al. Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[54] Peter Stone, et al. Stochastic Grounded Action Transformation for Robot Learning in Simulation, 2020, IROS.