Online Linear Regression and Its Application to Model-Based Reinforcement Learning
暂无分享,去创建一个
[1] J. Tsitsiklis,et al. An optimal one-way multigrid algorithm for discrete-time stochastic control , 1991 .
[2] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[3] Claude-Nicolas Fiechter,et al. PAC adaptive control of linear systems , 1997, COLT '97.
[4] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[5] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[6] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[7] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[8] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .
[9] John Langford,et al. Exploration in Metric State Spaces , 2003, ICML.
[10] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[11] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[12] Pieter Abbeel,et al. Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.
[13] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.