Natural Policy Gradient Methods with Parameter-based Exploration for Control Tasks
Isao Ono | Shigenobu Kobayashi | Atsushi Miyamae | Yuichi Nagata
[1] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[2] Shigenobu Kobayashi, et al. Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward, 1995, ICML.
[3] Shigenobu Kobayashi, et al. Reinforcement Learning in POMDPs with Function Approximation, 1997, ICML.
[4] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[5] Shigenobu Kobayashi, et al. Reinforcement learning for continuous action using stochastic gradient ascent, 1998.
[6] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[7] Shun-ichi Amari, et al. Methods of information geometry, 2000.
[8] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[9] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[10] Jeff G. Schneider, et al. Covariant Policy Search, 2003, IJCAI.
[11] Peter L. Bartlett, et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, 2001, J. Mach. Learn. Res.
[12] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[13] Jin Yu, et al. Natural Actor-Critic for Road Traffic Optimisation, 2006, NIPS.
[14] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[15] Juha Karhunen, et al. Natural Conjugate Gradient in Variational Inference, 2007, ICONIP.
[16] Jürgen Schmidhuber, et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients, 2007, ICANN.
[17] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[18] Frank Sehnke, et al. Policy Gradients with Parameter-Based Exploration for Control, 2008, ICANN.
[19] Christian Igel, et al. Variable Metric Reinforcement Learning Methods Applied to the Noisy Mountain Car Problem, 2008, EWRL.
[20] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2008, NIPS.
[21] Tom Schaul, et al. Efficient natural evolution strategies, 2009, GECCO.
[22] Isao Ono, et al. Bidirectional Relation between CMA Evolution Strategies and Natural Evolution Strategies, 2010, PPSN.