Fourier Policy Gradients
Matthew Fellows | Kamil Ciosek | Shimon Whiteson
[1] M. Stone. The Generalized Weierstrass Approximation Theorem, 1948.
[2] Peter W. Glynn, et al. Likelihood ratio gradient estimation for stochastic systems, 1990, CACM.
[3] Jooyoung Park, et al. Universal Approximation Using Radial-Basis-Function Networks, 1991, Neural Computation.
[4] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[5] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[6] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res..
[7] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res..
[8] Richard S. Sutton, et al. Comparing Policy-Gradient Algorithms, 2001.
[9] Richard K. Beatson, et al. Reconstruction and representation of 3D objects with radial basis functions, 2001, SIGGRAPH.
[10] Elias M. Stein, et al. Fourier Analysis: An Introduction, 2003.
[11] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[12] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[13] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[14] V. Milman, et al. A characterization of the Fourier transform and related topics, 2008.
[15] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[16] Shimon Whiteson, et al. A theoretical and empirical analysis of Expected Sarsa, 2009, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[17] George Konidaris, et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis, 2011, AAAI.
[18] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[19] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[20] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[21] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[22] Koray Kavukcuoglu, et al. Multiple Object Recognition with Visual Attention, 2014, ICLR.
[23] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[24] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[25] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[26] Rob Fergus, et al. Learning Multiagent Communication with Backpropagation, 2016, NIPS.
[27] Elman Mansimov, et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, 2017, NIPS.
[28] Sergey Levine, et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic, 2016, ICLR.
[29] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[30] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[31] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[32] Shimon Whiteson, et al. Expected Policy Gradients, 2017, AAAI.
[33] Shimon Whiteson, et al. Expected Policy Gradients for Reinforcement Learning, 2018, J. Mach. Learn. Res..