Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton | David A. McAllester | Satinder P. Singh | Yishay Mansour
[1] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[3] David S. Touretzky, et al. Connectionist models: proceedings of the 1990 summer school, 1991.
[4] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[5] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[6] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[7] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[8] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[9] Xi-Ren Cao, et al. Perturbation realization, potentials, and sensitivity analysis of Markov processes, 1997, IEEE Trans. Autom. Control.
[10] Shigenobu Kobayashi, et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function, 1998, ICML.
[11] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[12] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[13] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[14] P. Bartlett, et al. Direct Gradient-Based Reinforcement Learning: I. Gradient Estimation Algorithms, 1999.
[15] J. Baxter, et al. Direct gradient-based reinforcement learning, 2000, 2000 IEEE International Symposium on Circuits and Systems.