Context-Aware Policy Reuse
Siyuan Li | Chongjie Zhang | Fangda Gu | Guangxiang Zhu
[1] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 1996.
[2] Yang Gao,et al. Measuring the Distance Between Finite Markov Decision Processes , 2016, AAMAS.
[3] Emilio Soria Olivas,et al. Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques, 2009.
[4] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[5] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[6] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[7] S. Shankar Sastry,et al. A Multi-Armed Bandit Approach for Online Expert Selection in Markov Decision Processes , 2017, ArXiv.
[8] Doina Precup,et al. Optimal policy switching algorithms for reinforcement learning , 2010, AAMAS.
[9] Razvan Pascanu,et al. Advances in optimizing recurrent networks , 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Peter Stone,et al. The utility of temporal abstraction in reinforcement learning , 2008, AAMAS.
[11] Eric Eaton,et al. Unsupervised Cross-Domain Transfer in Policy Gradient Reinforcement Learning via Manifold Alignment , 2015, AAAI.
[12] Sergey Levine,et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning , 2017, ICLR.
[13] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[14] Shie Mannor,et al. Time-Regularized Interrupting Options (TRIO) , 2014, ICML.
[15] Doina Precup,et al. Learning with Options that Terminate Off-Policy , 2017, AAAI.
[16] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[17] Pieter Abbeel,et al. Meta Learning Shared Hierarchies , 2017, ICLR.
[18] Gavriel Salomon,et al. Transfer of Learning, 1992.
[19] Benjamin Rosman,et al. Bayesian policy reuse , 2015, Machine Learning.
[20] Rich Caruana,et al. Multitask Learning , 1997, Machine-mediated learning.
[21] Tie-Yan Liu,et al. Target Transfer Q-Learning and Its Convergence Analysis , 2018, Neurocomputing.
[22] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[23] Doina Precup,et al. When Waiting is not an Option: Learning Options with a Deliberation Cost , 2017, AAAI.
[24] Alessandro Lazaric,et al. Regret Bounds for Reinforcement Learning with Policy Advice , 2013, ECML/PKDD.
[25] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[26] Doina Precup,et al. Intra-Option Learning about Temporally Abstract Actions , 1998, ICML.
[27] Lihong Li,et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning , 2014, ICML.
[28] Tom Schaul,et al. Successor Features for Transfer in Reinforcement Learning , 2016, NIPS.
[29] Lihong Li,et al. Sample Complexity of Multi-task Reinforcement Learning , 2013, UAI.
[30] Ioannis P. Vlahavas,et al. Transfer learning with probabilistic mapping selection , 2015, Adapt. Behav..
[31] Matthew E. Taylor,et al. Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence , 2014, AAAI.
[32] Eric Eaton,et al. An automated measure of MDP similarity for transfer in reinforcement learning , 2014, AAAI 2014.
[33] Manuela M. Veloso,et al. Learning domain structure through probabilistic policy reuse in reinforcement learning , 2013, Progress in Artificial Intelligence.
[34] Siyuan Li,et al. An Optimal Online Method of Selecting Source Policies for Reinforcement Learning , 2017, AAAI.
[35] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[36] Doina Precup,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options , 1998, ECML.
[37] Romain Laroche,et al. Transfer Reinforcement Learning with Shared Dynamics , 2017, AAAI.
[38] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[39] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.