An Analysis of Laplacian Methods for Value Function Approximation in MDPs