GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
[1] Sergey Levine,et al. Guided Policy Search , 2013, ICML.
[2] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[3] Kenneth O. Stanley,et al. Abandoning Objectives: Evolution Through the Search for Novelty Alone , 2011, Evolutionary Computation.
[4] Filip De Turck,et al. Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks , 2016, ArXiv.
[5] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[6] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[7] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[8] Pierre-Yves Oudeyer,et al. Active learning of inverse models with intrinsically motivated goal exploration in robots , 2013, Robotics Auton. Syst..
[9] Kenneth O. Stanley,et al. ES is more than just a traditional finite-difference approximator , 2017, GECCO.
[10] Kenneth O. Stanley,et al. Confronting the Challenge of Quality Diversity , 2015, GECCO.
[11] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[12] Pierre-Yves Oudeyer,et al. The strategic student approach for life-long exploration and learning , 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[13] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[14] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[15] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[16] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[17] Marlos C. Machado,et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents , 2017, J. Artif. Intell. Res..
[18] Peter Henderson,et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control , 2017, ArXiv.
[19] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[20] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[21] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[22] Pierre-Yves Oudeyer,et al. Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration , 2018, ICLR.
[23] Sergey Levine,et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic , 2016, ICLR.
[24] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[25] Matthieu Zimmer,et al. Bootstrapping Q-Learning for Robotics From Neuro-Evolution Results , 2018, IEEE Transactions on Cognitive and Developmental Systems.
[26] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[27] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[28] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[29] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[30] Pierre-Yves Oudeyer,et al. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning , 2017, J. Mach. Learn. Res..
[31] Elman Mansimov,et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation , 2017, NIPS.
[32] Olivier Sigaud,et al. Robot Skill Learning: From Reinforcement Learning to Evolution Strategies , 2013, Paladyn J. Behav. Robotics.
[33] Marlos C. Machado,et al. A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.
[34] Kenneth O. Stanley,et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents , 2017, NeurIPS.
[35] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[36] Nando de Freitas,et al. Sample Efficient Actor-Critic with Experience Replay , 2016, ICLR.
[37] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[38] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[39] Yiannis Demiris,et al. Quality and Diversity Optimization: A Unifying Modular Framework , 2017, IEEE Transactions on Evolutionary Computation.
[40] Kenneth O. Stanley,et al. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , 2017, ArXiv.
[41] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[42] Pierre-Yves Oudeyer,et al. Modular active curiosity-driven discovery of tool use , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).