[1] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[2] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 1998, Machine Learning.
[3] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963.
[4] Lihong Li, et al. Incremental Model-based Learners With Formal Learning-Time Guarantees, 2006, UAI.
[5] S. Shankar Sastry, et al. Autonomous Helicopter Flight via Reinforcement Learning, 2003, NIPS.
[6] J. Tsitsiklis, et al. An optimal multigrid algorithm for continuous state discrete time stochastic control, 1988, Proceedings of the 27th IEEE Conference on Decision and Control.
[7] Michael L. Littman, et al. Efficient Reinforcement Learning with Relocatable Action Models, 2007, AAAI.
[8] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.
[9] Michael L. Littman, et al. Online Linear Regression and Its Application to Model-Based Reinforcement Learning, 2007, NIPS.
[10] Bethany R. Leffler, et al. Efficient Learning of Dynamics Models using Terrain Classification, 2008.
[11] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[12] Pieter Abbeel, et al. Exploration and apprenticeship learning in reinforcement learning, 2005, ICML.
[13] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.