论文信息 - Online Linear Regression and Its Application to Model-Based Reinforcement Learning

Online Linear Regression and Its Application to Model-Based Reinforcement Learning

We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and show that a special type of online linear regression allows us to learn MDPs with (possibly kernalized) linearly parameterized dynamics. This result builds on Kearns and Singh's work that provides a provably efficient algorithm for finite state MDPs. Our approach is not restricted to the linear setting, and is applicable to other classes of continuous MDPs.

Michael L. Littman | Alexander L. Strehl | M. Littman | A. Strehl

[1] J. Tsitsiklis,et al. An optimal one-way multigrid algorithm for discrete-time stochastic control , 1991 .

[2] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.

[3] Claude-Nicolas Fiechter,et al. PAC adaptive control of linear systems , 1997, COLT '97.

[4] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .

[5] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..

[6] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..

[7] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.

[8] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .

[9] John Langford,et al. Exploration in Metric State Spaces , 2003, ICML.

[10] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.

[11] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.

[12] Pieter Abbeel,et al. Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.

[13] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.