暂无分享,去创建一个
[1] Andrew G. Barto,et al. Conjugate Markov Decision Processes , 2011, ICML.
[2] Andrew G. Barto,et al. Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.
[3] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[4] Sridhar Mahadevan,et al. Proto-value functions: developmental reinforcement learning , 2005, ICML.
[5] David A. Huffman,et al. A method for the construction of minimum-redundancy codes , 1952, Proceedings of the IRE.
[6] Donald J. Berndt,et al. Using Dynamic Time Warping to Find Patterns in Time Series , 1994, KDD Workshop.
[7] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[8] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[9] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[10] Balaraman Ravindran,et al. Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks , 2016, ArXiv.
[11] Justin Fu,et al. EX2: Exploration with Exemplar Models for Deep Reinforcement Learning , 2017, NIPS.
[12] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[13] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[14] Andrew G. Barto,et al. Autonomous shaping: knowledge transfer in reinforcement learning , 2006, ICML.
[15] Andrew W. Moore,et al. Efficient memory-based learning for robot control , 1990 .
[16] Marlos C. Machado,et al. A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.
[17] Terry A. Welch,et al. A Technique for High-Performance Data Compression , 1984, Computer.
[18] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[19] Steven D. Whitehead,et al. Complexity and Cooperation in Q-Learning , 1991, ML.
[20] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[21] Michael Buro,et al. On the Maximum Length of Huffman Codes , 1993, Inf. Process. Lett..
[22] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[23] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[24] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[25] Nathan R. Sturtevant,et al. Benchmarks for Grid-Based Pathfinding , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[26] M. Anand. “1984” , 1962 .
[27] Sergey Levine,et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning , 2017, ICLR.
[28] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML 2000.
[29] R. Sutton,et al. Macro-Actions in Reinforcement Learning: An Empirical Analysis , 1998 .