Least Squares Policy Evaluation Algorithms with Linear Function Approximation
[1] Emanuel Parzen,et al. Modern Probability Theory And Its Applications , 1962 .
[2] L. Sucheston. Modern Probability Theory and its Applications , 1961 .
[3] J. Neveu,et al. Discrete Parameter Martingales , 1975 .
[4] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[5] Dimitri P. Bertsekas,et al. Nonlinear Programming , 1997 .
[6] Robert G. Gallager,et al. Discrete Stochastic Processes , 1995 .
[7] Dimitri P. Bertsekas,et al. A Counterexample to Temporal Differences Learning , 1995, Neural Computation.
[8] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[9] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[10] S. Ioffe,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1996 .
[11] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[12] Dimitri P. Bertsekas,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1997 .
[13] John N. Tsitsiklis,et al. Gradient Convergence in Gradient Methods with Errors , 1999, SIAM J. Optim..
[14] Michael I. Jordan,et al. On the Convergence of Temporal-Difference Learning with Linear Function Approximation , 2001 .
[15] Dudley,et al. Real Analysis and Probability: Measurability: Borel Isomorphism and Analytic Sets , 2002 .
[16] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[17] Terrence J. Sejnowski,et al. TD(λ) Converges with Probability 1 , 1994, Machine Learning.
[18] Vladislav Tadic,et al. On the Convergence of Temporal-Difference Learning with Linear Function Approximation , 2001, Machine Learning.
[19] Justin A. Boyan,et al. Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.
[20] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[21] B. Nordstrom. FINITE MARKOV CHAINS , 2005 .