An high-efficient online reinforcement learning algorithm for continuous-state systems
暂无分享,去创建一个
[1] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[2] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[3] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[4] Lihong Li,et al. PAC model-free reinforcement learning , 2006, ICML.
[5] Jason Pazis,et al. PAC Optimal Exploration in Continuous Space Markov Decision Processes , 2013, AAAI.
[6] Derong Liu,et al. Optimal control for discrete-time affine non-linear systems using general value iteration , 2012 .
[7] Andrey Bernstein,et al. Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains , 2010, Machine Learning.
[8] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[9] Michael L. Littman,et al. A theoretical analysis of Model-Based Interval Estimation , 2005, ICML.
[10] Bart De Schutter,et al. Reinforcement Learning and Dynamic Programming Using Function Approximators , 2010 .
[11] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.