Using Gaussian Processes for Variance Reduction in Policy Gradient Algorithms
[1] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994.
[2] Mohammad Ghavamzadeh,et al. Bayesian Policy Gradient Algorithms , 2006, NIPS.
[3] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.
[4] Carl E. Rasmussen,et al. Gaussian process dynamic programming , 2009, Neurocomputing.
[5] L. Csató. Gaussian processes: iterative sparse approximations , 2002.
[6] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[7] Jürgen Schmidhuber,et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients , 2007, ICANN.
[8] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[9] Shie Mannor,et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning , 2003, ICML.
[10] M. Opper. On-line versus Off-line Learning from Random Examples: General Results , 1996, Physical Review Letters.
[11] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[12] Manfred Opper,et al. Sparse Representation for Gaussian Process Models , 2000, NIPS.
[13] Jürgen Schmidhuber,et al. State-Dependent Exploration for Policy Gradient Methods , 2008, ECML/PKDD.
[14] Mohammad Ghavamzadeh,et al. Bayesian actor-critic algorithms , 2007, ICML '07.
[15] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[16] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[17] Ole Winther,et al. Efficient Approaches to Gaussian Process Classification , 1999, NIPS.