Paper Information - Shield Synthesis for Reinforcement Learning - 字舞流文

Shield Synthesis for Reinforcement Learning

Florian Lorber | Nils Jansen | Bettina Könighofer | Roderick Bloem

[1] J. A. Anderson, et al. Talking Nets: An Oral History Of Neural Networks, 1998, IEEE Trans. Neural Networks.

[2] Tomás Svoboda, et al. Safe Exploration Techniques for Reinforcement Learning - An Overview, 2014, MESAS.

[3] Rajeev Alur, et al. A Theory of Timed Automata, 1994, Theor. Comput. Sci.

[4] Ufuk Topcu, et al. Safe Reinforcement Learning via Shielding, 2017, AAAI.

[5] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.

[6] Joseph Sifakis, et al. On the Synthesis of Discrete Controllers for Timed Systems (An Extended Abstract), 1995, STACS.

[7] Kim G. Larsen, et al. Uppaal Stratego, 2015, TACAS.

[8] Nathan Fulton, et al. Verifiably Safe Off-Model Reinforcement Learning, 2019, TACAS.

[9] Sebastian Junges, et al. A Storm is Coming: A Modern Probabilistic Model Checker, 2017, CAV.

[10] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.

[11] Kim G. Larsen, et al. Safe and Optimal Adaptive Cruise Control, 2015, Correct System Design.

[12] Kim G. Larsen, et al. It's Time to Play Safe: Shield Synthesis for Timed Systems, 2020, ArXiv.

[13] Christel Baier, et al. Principles of model checking, 2008.

[14] Chao Wang, et al. Shield Synthesis: Runtime Enforcement for Reactive Systems, 2015, TACAS.

[15] Thierry Jéron, et al. Optimal enforcement of (timed) properties with uncontrollable events, 2017, Mathematical Structures in Computer Science.

[16] Sebastian Junges, et al. Shielded Decision-Making in MDPs, 2018, ArXiv.

[17] Radu Calinescu, et al. Assured Reinforcement Learning with Formally Verified Abstract Policies, 2017, ICAART.

[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[19] Amir Pnueli, et al. The temporal logic of programs, 1977, 18th Annual Symposium on Foundations of Computer Science (sfcs 1977).

[20] Yliès Falcone, et al. On the Runtime Enforcement of Timed Properties, 2019, RV.

[21] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.

[22] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.