Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics
[1] Martin Stolle, Doina Precup. Learning Options in Reinforcement Learning, 2002, SARA.
[2] M. Rehm, et al. Proceedings of AAMAS, 2005.
[3] Jonathan Taylor, Doina Precup, Prakash Panangaden. Bounding Performance Loss in Approximate MDP Homomorphisms, 2008, NIPS.
[4] Shie Mannor, et al. Dynamic abstraction in reinforcement learning via clustering, 2004, ICML.
[5] Christopher J. C. H. Watkins, Peter Dayan. Technical Note: Q-Learning, 1992, Machine Learning.
[6] Jonathan Sorg, Satinder P. Singh. Transfer via soft homomorphisms, 2009, AAMAS.
[7] Pablo Samuel Castro, Doina Precup. Using Bisimulation for Policy Transfer in MDPs, 2010, AAAI.
[8] Özgür Şimşek, Alicia P. Wolfe, Andrew G. Barto. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[9] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[10] Anders Jonsson, Andrew G. Barto. Causal Graph Based Decomposition of Factored MDPs, 2006, J. Mach. Learn. Res.
[11] Ronald Parr, Stuart J. Russell. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[12] John R. Anderson. ACT: A simple theory of complex cognition, 1996.
[13] Michael Wooldridge, et al. Proceedings of the 21st International Joint Conference on Artificial Intelligence, 2009.
[14] Richard S. Sutton, Doina Precup, Satinder Singh. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[15] Christopher J. C. H. Watkins, Peter Dayan. Q-learning, 1992, Machine Learning.
[16] Jean-Daniel Zucker, et al. Abstraction, Reformulation and Approximation: 6th International Symposium, SARA 2005, Airth Castle, Scotland, UK, July 26-29, 2005, Proceedings, 2005, SARA.
[17] Doina Precup. Temporal abstraction in reinforcement learning, 2000, Ph.D. thesis, University of Massachusetts Amherst.
[18] Craig Boutilier, Richard Dearden, Moisés Goldszmidt. Exploiting Structure in Policy Construction, 1995, IJCAI.
[19] Alicia P. Wolfe. Defining Object Types and Options Using MDP Homomorphisms, 2006.
[20] Neville Mehta, Soumya Ray, Prasad Tadepalli, Thomas G. Dietterich. Automatic discovery and transfer of MAXQ hierarchies, 2008, ICML '08.
[21] Amy McGovern, Andrew G. Barto. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[22] Peter Stone, Richard S. Sutton, Gregory Kuhlmann. Reinforcement Learning for RoboCup Soccer Keepaway, 2005, Adapt. Behav.
[23] Peng Zang, Peng Zhou, David Minnen, Charles L. Isbell. Discovering options from example trajectories, 2009, ICML '09.
[24] Norm Ferns, Prakash Panangaden, Doina Precup. Metrics for Finite Markov Decision Processes, 2004, AAAI.
[25] Richard S. Sutton, Andrew G. Barto. Introduction to Reinforcement Learning, 1998.
[26] Robert Givan, Thomas Dean, Matthew Greig. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[27] George Konidaris, Scott Kuindersma, Andrew G. Barto, Roderic A. Grupen. Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories, 2010, NIPS.
[28] Andrew G. Barto, Sridhar Mahadevan. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[29] Vishal Soni, Satinder P. Singh. Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains, 2006, AAAI.
[30] Gheorghe Comanici, Doina Precup. Optimal policy switching algorithms for reinforcement learning, 2010, AAMAS.
[31] Matthew E. Taylor, Peter Stone. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[32] Balaraman Ravindran, Andrew G. Barto. Relativized Options: Choosing the Right Transformation, 2003, ICML.
[33] Norm Ferns, Pablo Samuel Castro, Doina Precup, Prakash Panangaden. Methods for Computing State Similarity in Markov Decision Processes, 2006, UAI.
[34] Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.