Yujing Hu | Jianye Hao | Zhaopeng Meng | Tianpei Yang | Weixun Wang | Zongzhang Zhang | Changjie Fan | Yingfeng Chen | Zhaodong Wang | Jiajie Peng
[1] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[2] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[3] Matthew E. Taylor, et al. Policy Transfer using Reward Shaping, 2015, AAMAS.
[4] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.
[5] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[6] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[7] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[8] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Andrew Zisserman, et al. Kickstarting Deep Reinforcement Learning, 2018, ArXiv.
[11] Siyuan Li, et al. An Optimal Online Method of Selecting Source Policies for Reinforcement Learning, 2017, AAAI.
[12] Manuela M. Veloso, et al. Probabilistic policy reuse in a reinforcement learning agent, 2006, AAMAS.
[13] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[14] Andrew G. Barto, et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining, 2009, NIPS.
[15] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[16] Doina Precup, et al. Learning Options End-to-End for Continuous Action Tasks, 2017, ArXiv.
[17] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[18] Peter Stone, et al. Transfer Learning via Inter-Task Mappings for Temporal Difference Learning, 2007, J. Mach. Learn. Res.
[19] Shie Mannor, et al. Time-Regularized Interrupting Options (TRIO), 2014, ICML.
[20] Philip Thomas, et al. Bias in Natural Actor-Critic Algorithms, 2014, ICML.
[21] Pushmeet Kohli, et al. CompILE: Compositional Imitation Learning and Execution, 2018, ICML.
[22] Siyuan Li, et al. Context-Aware Policy Reuse, 2018, AAMAS.
[23] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[24] Lihong Li, et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning, 2014, ICML.
[25] Doina Precup, et al. The Termination Critic, 2019, AISTATS.
[26] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[27] Balaraman Ravindran, et al. Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from Multiple Sources in the Same Domain, 2015, ICLR.
[28] Yang Gao, et al. Measuring the Distance Between Finite Markov Decision Processes, 2016, AAMAS.
[29] Saurabh Kumar, et al. Learning to Compose Skills, 2017, ArXiv.
[30] Doina Precup, et al. When Waiting Is Not an Option: Learning Options with a Deliberation Cost, 2017, AAAI.
[31] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[32] Gaurav S. Sukhatme, et al. Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets, 2017, NIPS.
[33] Romain Laroche, et al. Transfer Reinforcement Learning with Shared Dynamics, 2017, AAAI.
[34] Sinno Jialin Pan, et al. Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay, 2017, AAAI.