[1] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[2] Peter Stone, et al. Model-Based Exploration in Continuous State Spaces, 2007, SARA.
[3] L. Grüne. An adaptive grid scheme for the discrete Hamilton-Jacobi-Bellman equation, 1997.
[4] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.
[5] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[6] Andrey Bernstein, et al. Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains, 2010, Machine Learning.
[7] Michael L. Littman, et al. Multi-resolution Exploration in Continuous Spaces, 2008, NIPS.
[8] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[9] Bart De Schutter, et al. Online least-squares policy iteration for reinforcement learning control, 2010, Proceedings of the 2010 American Control Conference.
[10] Lihong Li, et al. Online exploration in least-squares policy iteration, 2009, AAMAS.
[11] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[12] Scott Davies, et al. Multidimensional Triangulation and Interpolation for Reinforcement Learning, 1996, NIPS.
[13] Gary Boone, et al. Minimum-time control of the Acrobot, 1997, Proceedings of International Conference on Robotics and Automation.
[14] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[15] J. Weston, et al. Approximation Methods for Gaussian Process Regression, 2007.
[16] N. Shimkin, et al. Adaptive-Resolution Reinforcement Learning with Efficient Exploration in Deterministic Domains, 2009.
[17] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[18] Daniel Polani, et al. Learning RoboCup-Keepaway with Kernels, 2007, Gaussian Processes in Practice.
[19] Andrew W. Moore, et al. Variable Resolution Discretization in Optimal Control, 2002, Machine Learning.
[20] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[21] Shie Mannor, et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning, 2003, ICML.