Optimality of Reinforcement Learning Algorithms with Linear Function Approximation
[1] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[2] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[3] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS.
[4] Anne Greenbaum, et al. Iterative methods for solving linear systems, 1997, Frontiers in applied mathematics.
[5] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[6] Daphne Koller, et al. Policy Iteration for Factored MDPs, 2000, UAI.
[7] Michail G. Lagoudakis, et al. Model-Free Least-Squares Policy Iteration, 2001, NIPS.
[8] Artur Merke, et al. Convergent Combinations of Reinforcement Learning with Linear Function Approximation, 2002, NIPS.
[9] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.