Representation Policy Iteration
[1] Craig Boutilier, et al. Greedy linear value-approximation for factored Markov decision processes, 2002, AAAI/IAAI.
[2] Shie Mannor, et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning, 2005, Ann. Oper. Res.
[3] Daphne Koller, et al. Policy Iteration for Factored MDPs, 2000, UAI.
[4] Sridhar Mahadevan, et al. Proto-value functions: developmental reinforcement learning, 2005, ICML.
[5] Mikhail Belkin, et al. Semi-Supervised Learning on Riemannian Manifolds, 2004, Machine Learning.
[6] S. Axler, et al. Harmonic Function Theory, 1992.
[7] S. Rosenberg. The Laplacian on a Riemannian Manifold, 1997.
[8] Sridhar Mahadevan, et al. Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions, 2005, NIPS.
[9] Fan Chung, et al. Spectral Graph Theory, 1996.
[10] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[11] Michael I. Jordan, et al. On Spectral Clustering: Analysis and an algorithm, 2001, NIPS.
[12] Jitendra Malik, et al. Normalized cuts and image segmentation, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[13] Sridhar Mahadevan, et al. Samuel Meets Amarel: Automating Value Function Approximation Using Global State Space Analysis, 2005, AAAI.
[14] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[15] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[16] Shobha Venkataraman, et al. Efficient Solution Algorithms for Factored MDPs, 2003, J. Artif. Intell. Res.
[17] Sridhar Mahadevan, et al. Fast direct policy evaluation using multiscale analysis of Markov diffusion processes, 2006, ICML.
[18] R. Coifman, et al. Diffusion Wavelets, 2004.
[19] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.