Evolution-Guided Policy Gradient in Reinforcement Learning
[1] Nando de Freitas,et al. Sample Efficient Actor-Critic with Experience Replay , 2016, ICLR.
[2] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[3] R. Mazo. On the theory of Brownian motion , 1973 .
[4] Sebastian Risi,et al. Continual and One-Shot Learning Through Neural Networks with Dynamic External Memory , 2017, EvoApplications.
[5] Risto Miikkulainen,et al. Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.
[6] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[7] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[8] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[9] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[11] Richard S. Sutton,et al. Multi-step Off-policy Learning Without Importance Sampling Ratios , 2017, ArXiv.
[12] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[13] Marc G. Bellemare,et al. Q(λ) with Off-Policy Corrections , 2016, ALT.
[14] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[15] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[16] Richard S. Sutton,et al. Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods , 2018 .
[17] Pierre-Yves Oudeyer,et al. GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms , 2017, ICML.
[18] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[19] Richard E. Turner,et al. Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning , 2017, NIPS.
[20] Richard S. Sutton,et al. Multi-step Reinforcement Learning: A Unifying Algorithm , 2017, AAAI.
[21] Antoine Cully,et al. Robots that can adapt like animals , 2014, Nature.
[22] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[23] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[24] Kenneth O. Stanley,et al. Quality Diversity: A New Frontier for Evolutionary Computation , 2016, Front. Robot. AI.
[25] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[26] Andreas Stafylopatis,et al. Autonomous vehicle navigation using evolutionary reinforcement learning , 1998, Eur. J. Oper. Res.
[27] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[28] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[29] Jian Peng,et al. Genetic Policy Optimization , 2017, ICLR 2018.
[30] Julian Togelius,et al. Neuroevolution in Games: State of the Art and Open Challenges , 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[31] Peter Henderson,et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control , 2017, ArXiv.
[32] David H. Ackley,et al. Interactions between learning and evolution , 1991 .
[33] David Pfau,et al. Convolution by Evolution: Differentiable Pattern Producing Networks , 2016, GECCO.
[34] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[35] Dario Floreano,et al. Neuroevolution: from architectures to learning , 2008, Evol. Intell.
[36] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[37] Kenneth O. Stanley,et al. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , 2017, ArXiv.
[38] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[39] Shalabh Bhatnagar,et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation , 2009, NIPS.
[40] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[41] Shimon Whiteson,et al. Evolutionary Function Approximation for Reinforcement Learning , 2006, J. Mach. Learn. Res.
[42] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[43] Marc G. Bellemare,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[44] Madalina M. Drugan,et al. Reinforcement learning versus evolutionary computation: A survey on hybrid algorithms , 2019, Swarm Evol. Comput.
[45] Kenneth O. Stanley,et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents , 2017, NeurIPS.
[46] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[47] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[48] Thomas Bäck,et al. An Overview of Evolutionary Computation , 1993, ECML.
[49] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[50] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[51] Peter D. Turney,et al. Evolution, Learning, and Instinct: 100 Years of the Baldwin Effect , 1996, Evolutionary Computation.
[52] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[53] Chang Wook Ahn,et al. Elitism-based compact genetic algorithms , 2003, IEEE Trans. Evol. Comput.
[54] David B. Fogel,et al. Evolutionary Computation: Towards a New Philosophy of Machine Intelligence , 1995 .
[55] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[56] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[57] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[58] Oriol Vinyals,et al. Hierarchical Representations for Efficient Architecture Search , 2017, ICLR.
[59] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[60] Thomas Bäck,et al. Evolutionary computation: Toward a new philosophy of machine intelligence , 1997, Complex.
[61] Chrisantha Fernando,et al. PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.
[62] Kenneth O. Stanley,et al. Exploiting Open-Endedness to Solve Problems Through the Search for Novelty , 2008, ALIFE.