A unified framework for temporal difference methods
暂无分享,去创建一个
[1] David Choi,et al. A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning , 2001, Discret. Event Dyn. Syst..
[2] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[3] A. Barto,et al. Improved Temporal Difference Methods with Linear Function Approximation , 2004 .
[4] Dimitri P. Bertsekas,et al. Convergence Results for Some Temporal Difference Methods Based on Least Squares , 2009, IEEE Transactions on Automatic Control.
[5] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[6] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[7] Richard S. Sutton,et al. Reinforcement Learning , 1992, Handbook of Machine Learning.
[8] D. Bertsekas,et al. Journal of Computational and Applied Mathematics Projected Equation Methods for Approximate Solution of Large Linear Systems , 2022 .
[9] Justin A. Boyan,et al. Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.
[10] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[11] S. Ioffe,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1996 .
[12] Dimitri P. Bertsekas,et al. Temporal Dierences-Based Policy Iteration and Applications in Neuro-Dynamic Programming 1 , 1997 .
[13] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[14] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[15] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[16] Andrew G. Barto,et al. Linear Least-Squares Algorithms for Temporal Difference Learning , 2005, Machine Learning.
[17] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[18] D. Bertsekas,et al. A Least Squares Q-Learning Algorithm for Optimal Stopping Problems , 2007 .
[19] D. Bertsekas,et al. Projection methods for variational inequalities with application to the traffic assignment problem , 1982 .
[20] F. Facchinei,et al. Finite-Dimensional Variational Inequalities and Complementarity Problems , 2003 .
[21] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst..
[22] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Vol. II , 1976 .