Yali Du | Rui Yang | Lei Han | Feng Luo | Xiu Li | Meng Fang
[1] Pieter Abbeel, et al. Generalized Hindsight for Reinforcement Learning, 2020, NeurIPS.
[2] Rui Zhao, et al. Maximum Entropy-Regularized Multi-Goal Reinforcement Learning, 2019, ICML.
[3] Pieter Abbeel, et al. Visual Hindsight Experience Replay, 2019, arXiv.
[4] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[5] Peter Stone, et al. Learning Predictive State Representations, 2003, ICML.
[6] Sergey Levine, et al. Learning to Reach Goals via Iterated Supervised Learning, 2019, ICLR.
[7] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[8] Mohammad Norouzi, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[9] Jiangpeng Yan, et al. Efficient Continuous Control with Double Actors and Regularized Critics, 2021, AAAI.
[10] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[11] Tom Schaul, et al. Better Generalization with Forecasts, 2013, IJCAI.
[12] Giovanni Montana, et al. PlanGAN: Model-based Planning with Sparse Rewards and Multiple Goals, 2020, NeurIPS.
[13] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[14] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[15] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[16] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SIGART Bulletin.
[17] Sergey Levine, et al. Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement, 2020, NeurIPS.
[18] Lei Han, et al. Curriculum-guided Hindsight Experience Replay, 2019, NeurIPS.
[19] Michael I. Jordan, et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, 2018, arXiv.
[20] Yong Yu, et al. MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks, 2021, IJCAI.
[21] Sergey Levine, et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning, 2018, ICRA.
[22] Zeb Kurth-Nelson, et al. Learning to reinforcement learn, 2016, CogSci.
[23] Jimmy Ba, et al. Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning, 2020, ICML.
[24] Xiaotong Liu, et al. Policy Continuation with Hindsight Inverse Dynamics, 2019, NeurIPS.
[25] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[26] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, CVPRW.
[27] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[28] Pieter Abbeel, et al. Goal-conditioned Imitation Learning, 2019, NeurIPS.
[29] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[30] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[31] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[32] Jason Weston, et al. Curriculum learning, 2009, ICML.
[33] Sergey Levine, et al. When to Trust Your Model: Model-Based Policy Optimization, 2019, NeurIPS.