Zhenghao Peng | Hao Sun | Bolei Zhou
[1] Greg Turk, et al. Learning Novel Policies For Tasks, 2019, ICML.
[2] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[3] Marc G. Bellemare, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[4] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[5] Michael I. Jordan, et al. RLlib: Abstractions for Distributed Reinforcement Learning, 2017, ICML.
[6] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[7] Henryk Michalewski, et al. Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes, 2018, ISC.
[8] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[9] Brendan McCane, et al. VASE: Variational Assorted Surprise Exploration for Reinforcement Learning, 2019, IEEE Transactions on Neural Networks and Learning Systems.
[10] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[11] E. Paice, et al. Collaborative learning, 2003, Medical Education.
[12] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[13] S. Shankar Sastry, et al. Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning, 2017, ArXiv.
[14] Hans-Paul Schwefel, et al. Evolution strategies – A comprehensive introduction, 2002, Natural Computing.
[15] Shie Mannor, et al. Distributional Policy Optimization: An Alternative Approach for Continuous Control, 2019, NeurIPS.
[16] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[17] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[18] Karen Simonyan, et al. Off-Policy Actor-Critic with Shared Experience Replay, 2020, ICML.
[19] Stéphane Doncieux, et al. Behavioral diversity with multiple behavioral distances, 2013, IEEE Congress on Evolutionary Computation.
[20] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Transactions on Neural Networks.
[21] Zheng Wen, et al. Deep Exploration via Randomized Value Functions, 2017, Journal of Machine Learning Research.
[22] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[23] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[24] Alexei A. Efros, et al. Large-Scale Study of Curiosity-Driven Learning, 2018, ICLR.
[25] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[26] Kenneth O. Stanley, et al. Abandoning Objectives: Evolution Through the Search for Novelty Alone, 2011, Evolutionary Computation.
[27] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[28] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[29] Kenneth O. Stanley, et al. Quality Diversity: A New Frontier for Evolutionary Computation, 2016, Frontiers in Robotics and AI.
[30] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[31] Robert Loftin, et al. Better Exploration with Optimistic Actor-Critic, 2019, NeurIPS.
[32] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[33] Tom Schaul, et al. Exploring parameter space in reinforcement learning, 2010, Paladyn Journal of Behavioral Robotics.
[34] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[35] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[36] Finale Doshi-Velez, et al. Diversity-Inducing Policy Gradient: Using Maximum Mean Discrepancy to Find a Set of Diverse Policies, 2019, IJCAI.
[37] Matthew W. Hoffman, et al. Distributed Distributional Deterministic Policy Gradients, 2018, ICLR.
[38] Shane Legg, et al. Noisy Networks for Exploration, 2017, ICLR.
[39] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[40] Lei Yu, et al. Diverse Exploration via Conjugate Policies for Policy Gradient Methods, 2019, AAAI.
[41] E. Cohen. Restructuring the Classroom: Conditions for Productive Small Groups, 1994.
[42] Kenneth O. Stanley, et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents, 2017, NeurIPS.
[43] Hao Wu, et al. Mastering Complex Control in MOBA Games with Deep Reinforcement Learning, 2019, AAAI.
[44] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[45] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[46] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[47] Zhang-Wei Hong, et al. Diversity-Driven Exploration Strategy for Deep Reinforcement Learning, 2018, NeurIPS.