Dead-ends and Secure Exploration in Reinforcement Learning
Mehdi Fatemi | Shikhar Sharma | Harm van Seijen | Samira Ebrahimi Kahou
[1] Kenneth O. Stanley, et al. Go-Explore: a New Approach for Hard-Exploration Problems, 2019, ArXiv.
[2] Romain Laroche, et al. Scaling up budgeted reinforcement learning, 2019, ArXiv.
[3] Marek Petrik, et al. Safe Policy Improvement by Minimizing Robust Baseline Regret, 2016, NIPS.
[4] E. Altman. Constrained Markov Decision Processes, 1999.
[5] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[6] Laurent Orseau, et al. Safely Interruptible Agents, 2016, UAI.
[7] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[8] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res..
[9] Ralph Neuneier, et al. Risk-Sensitive Reinforcement Learning, 1998, Machine Learning.
[10] Javier García, et al. A Comprehensive Survey on Safe Reinforcement Learning, 2015, J. Mach. Learn. Res..
[11] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[12] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[13] Philip S. Thomas, et al. Safe Reinforcement Learning, 2015.
[14] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.
[15] Rachid Guerraoui, et al. Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning, 2017, NIPS.
[16] T. Basar, et al. H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 1996, IEEE Trans. Autom. Control..