An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function
暂无分享,去创建一个
[1] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] L. Baird,et al. A MATHEMATICAL ANALYSIS OF ACTOR-CRITIC ARCHITECTURES FOR LEARNING OPTIMAL CONTROLS THROUGH INCREMENTAL DYNAMIC PROGRAMMING , 1990 .
[3] Richard S. Sutton,et al. Reinforcement learning architectures for animats , 1991 .
[4] Paul E. Utgoff,et al. A Teaching Method for Reinforcement Learning , 1992, ML.
[5] L. C. Baird,et al. Reinforcement learning in continuous time: advantage updating , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[6] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[7] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[8] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[9] Andrew G. Barto,et al. An Actor/Critic Algorithm that is Equivalent to Q-Learning , 1994, NIPS.
[10] Shigenobu Kobayashi,et al. Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward , 1995, ICML.
[11] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[12] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[13] Chin-Teng Lin,et al. Reinforcement learning for an ART-based fuzzy adaptive learning control network , 1996, IEEE Trans. Neural Networks.
[14] Richard S. Sutton,et al. Reinforcement Learning with Replacing Eligibility Traces , 2005, Machine Learning.
[15] Kenji Doya,et al. Efficient Nonlinear Control with Actor-Tutor Architecture , 1996, NIPS.
[16] Mark D. Pendrith,et al. Actual Return Reinforcement Learning versus Temporal Differences: Some Theoretical and Experimental Results , 1996, ICML.
[17] Shigenobu Kobayashi,et al. Reinforcement Learning in POMDPs with Function Approximation , 1997, ICML.