PAC model-free reinforcement learning
Lihong Li | Michael L. Littman | Alexander L. Strehl | John Langford | Eric Wiewiora
[1] Claude-Nicolas Fiechter, et al. Efficient reinforcement learning, 1994, COLT '94.
[2] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[3] Michael Kearns, et al. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms, 1998, NIPS.
[4] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[5] Yishay Mansour, et al. Learning Rates for Q-learning, 2004, J. Mach. Learn. Res.
[6] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[7] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[8] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[9] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[10] Michael L. Littman, et al. A theoretical analysis of Model-Based Interval Estimation, 2005, ICML.
[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.