Compositional Transfer in Hierarchical Reinforcement Learning
Martin A. Riedmiller | Jost Tobias Springenberg | N. Heess | Roland Hafner | Thomas Lampe | Markus Wulfmeier | A. Abdolmaleki | Michael Neunert | Tim Hertweck | Noah Siegel