Analysis of temporal-difference learning with function approximation