Bayesian actor-critic algorithms
[1] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] A. O'Hagan,et al. Bayes–Hermite quadrature , 1991 .
[3] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[4] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[5] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[6] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[7] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[8] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[9] Shie Mannor,et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning , 2003, ICML.
[10] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[11] Yaakov Engel,et al. Algorithms and representations for reinforcement learning , 2005 .
[12] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[13] Mohammad Ghavamzadeh,et al. Bayesian Policy Gradient Algorithms , 2006, NIPS.