Shield Synthesis for Reinforcement Learning
Florian Lorber | Nils Jansen | Bettina Könighofer | Roderick Bloem