Hanbo Zhang | Xuguang Lan | Deyu Yang | Jishiyu Ding