Goal-Conditioned Reinforcement Learning with Imagined Subgoals
[1] Bradly C. Stadie, et al. World Model as a Graph: Learning Latent Landmarks for Planning, 2021, ICML.
[2] Ilya Kostrikov, et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, 2020, arXiv.
[3] Jimmy Ba, et al. Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning, 2020, ICML.
[4] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[5] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[6] Nando de Freitas, et al. Critic Regularized Regression, 2020, NeurIPS.
[7] Yee Whye Teh, et al. Exploiting Hierarchy for Learning and Transfer in KL-regularized RL, 2019, arXiv.
[8] Sergey Levine, et al. Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning, 2019, CoRL.
[9] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[10] Pieter Abbeel, et al. Reinforcement Learning with Augmented Data, 2020, NeurIPS.
[11] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[12] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[13] Sergey Levine, et al. Temporal Difference Models: Model-Free Deep RL for Model-Based Control, 2018, ICLR.
[14] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[15] Chelsea Finn, et al. Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation, 2019, ICLR.
[16] S. Levine, et al. Accelerating Online Reinforcement Learning with Offline Datasets, 2020, arXiv.
[17] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[18] Sergey Levine, et al. Search on the Replay Buffer: Bridging Planning and Reinforcement Learning, 2019, NeurIPS.
[19] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[20] Satinder Singh, et al. Many-Goals Reinforcement Learning, 2018, arXiv.
[21] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[22] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[23] Yuval Tassa, et al. Relative Entropy Regularized Policy Iteration, 2018, arXiv.
[24] Marc Toussaint, et al. On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference, 2012, Robotics: Science and Systems.
[25] Sergey Levine, et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, arXiv.
[26] Sergey Levine, et al. Visual Reinforcement Learning with Imagined Goals, 2018, NeurIPS.
[27] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[28] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[29] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[30] Joseph J. Lim, et al. Accelerating Reinforcement Learning with Learned Skill Priors, 2020, CoRL.
[31] Yifan Wu, et al. Behavior Regularized Offline Reinforcement Learning, 2019, arXiv.
[32] Yee Whye Teh, et al. Information asymmetry in KL-regularized RL, 2019, ICLR.
[33] Chelsea Finn, et al. Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors, 2020, NeurIPS.
[34] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML.
[35] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[36] Kate Saenko, et al. Learning Multi-Level Hierarchies with Hindsight, 2017, ICLR.
[37] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, arXiv.
[38] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[39] Jessica B. Hamrick, et al. Divide-and-Conquer Monte Carlo Tree Search for Goal-Directed Planning, 2020, arXiv.
[40] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[41] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[42] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[43] Sergey Levine, et al. Planning with Goal-Conditioned Policies, 2019, NeurIPS.
[44] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[45] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[46] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[47] Rui Zhao, et al. Maximum Entropy-Regularized Multi-Goal Reinforcement Learning, 2019, ICML.
[48] Martin A. Riedmiller, et al. Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning, 2020, ICLR.
[49] Sergey Levine, et al. C-Learning: Learning to Achieve Goals via Recursive Classification, 2020, ICLR.
[50] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.