Policy search via the signed derivative
[1] S. Sastry, et al. Adaptive Control: Stability, Convergence and Robustness, 1989.
[2] B. Pasik-Duncan, et al. Adaptive Control, 1996, IEEE Control Systems.
[3] Kevin L. Moore, et al. Iterative Learning Control: An Expository Overview, 1999.
[4] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[5] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[6] Jeff G. Schneider, et al. Covariant policy search, 2003, IJCAI 2003.
[7] Peter L. Bartlett, et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, 2001, J. Mach. Learn. Res.
[8] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[9] Peter Stone, et al. Machine Learning for Fast Quadrupedal Locomotion, 2004, AAAI.
[10] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[11] Pieter Abbeel, et al. Using inaccurate models in reinforcement learning, 2006, ICML.
[12] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[13] Russ Tedrake, et al. Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms, 2008, NIPS.
[14] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2022.