Proto-value functions: developmental reinforcement learning
[1] S. Axler, et al. Harmonic Function Theory, 1992.
[2] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[3] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[4] S. Rosenberg. The Laplacian on a Riemannian Manifold: The Construction of the Heat Kernel, 1997.
[5] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[6] Daphne Koller, et al. Policy Iteration for Factored MDPs, 2000, UAI.
[7] Shie Mannor, et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning, 2002, ECML.
[8] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[9] Jitendra Malik, et al. Spectral grouping using the Nyström method, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[10] Alan M. Frieze, et al. Fast Monte-Carlo algorithms for finding low-rank approximations, 2004, JACM.
[11] R. Coifman, et al. Diffusion Wavelets, 2004.
[12] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[13] Sridhar Mahadevan, et al. Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions, 2005, NIPS.
[14] Alicia P. Wolfe, et al. Local Graph Partitioning as a Basis for Generating Temporally-Extended Actions in Reinforcement Learning, 2005.