PAC-inspired Option Discovery in Lifelong Reinforcement Learning
[1] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[2] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[3] Sebastian Thrun, et al. Lifelong robot learning, 1993, Robotics Auton. Syst.
[4] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[5] Andrew G. Barto, et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning, 2002, ICML.
[6] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[7] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[8] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[9] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[10] Shie Mannor, et al. Dynamic abstraction in reinforcement learning via clustering, 2004, ICML.
[11] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[12] Vishal Soni, et al. Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains, 2006, AAAI.
[13] Lihong Li, et al. Incremental Model-based Learners With Formal Learning-Time Guarantees, 2006, UAI.
[14] Andrew G. Barto, et al. Building Portable Options: Skill Transfer in Reinforcement Learning, 2007, IJCAI.
[15] Alan Fern, et al. Multi-task reinforcement learning: a hierarchical Bayesian approach, 2007, ICML '07.
[16] Peter Stone, et al. The utility of temporal abstraction in reinforcement learning, 2008, AAMAS.
[17] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[18] Satinder P. Singh, et al. Transfer via soft homomorphisms, 2009, AAMAS.
[19] Lihong Li, et al. Reinforcement Learning in Finite MDPs: PAC Analysis, 2009, J. Mach. Learn. Res.
[20] Alessandro Lazaric, et al. Transfer from Multiple MDPs, 2011, NIPS.
[21] Yoonsuck Choe, et al. Directed Exploration in Reinforcement Learning with Transferred Knowledge, 2012, EWRL.
[22] Tor Lattimore, et al. PAC Bounds for Discounted MDPs, 2012, ALT.
[23] Lihong Li, et al. Sample Complexity of Multi-task Reinforcement Learning, 2013, UAI.
[24] Shie Mannor, et al. Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations, 2014, ICML.