[1] Gang Niu, et al. Analysis and Improvement of Policy Gradient Estimation, 2011, NIPS.
[2] Jürgen Schmidhuber, et al. Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts, 2005.
[3] Frank Sehnke, et al. Baseline-Free Sampling in Parameter Exploring Policy Gradients: Super Symmetric PGPE, 2015.
[4] Tom Schaul, et al. Exploring parameter space in reinforcement learning, 2010, Paladyn J. Behav. Robotics.
[5] Tom Schaul, et al. Natural Evolution Strategies, 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[6] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[7] Isao Ono, et al. Natural Policy Gradient Methods with Parameter-based Exploration for Control Tasks, 2010, NIPS.
[8] Ronald L. Wasserstein, et al. Monte Carlo: Concepts, Algorithms, and Applications, 1997.
[9] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[10] Olivier Sigaud, et al. Path Integral Policy Improvement with Covariance Matrix Adaptation, 2012, ICML.
[11] Frank Sehnke, et al. Multimodal Parameter-exploring Policy Gradients, 2010, 2010 Ninth International Conference on Machine Learning and Applications.
[12] Jun Morimoto, et al. Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration, 2012, Neural Computation.
[13] Andreas Zell, et al. Automatic Calibration of Camera to World Mapping in RoboCup using Evolutionary Algorithms, 2006, 2006 IEEE International Conference on Evolutionary Computation.
[14] Peter Henderson, et al. A lazy evaluator, 1976, POPL.
[15] Tom Schaul, et al. Efficient natural evolution strategies, 2009, GECCO.
[16] Tom Schaul, et al. Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients, 2010, ICANN.
[17] Frank Sehnke, et al. Parameter-exploring policy gradients, 2010, Neural Networks.
[18] Frank Sehnke. Parameter exploring policy gradients and their implications, 2012.
[19] Andreas Zell, et al. An Automatic Approach to Online Color Training in RoboCup Environments, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[20] Peter L. Bartlett, et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, 2001, J. Mach. Learn. Res.