Shield Synthesis for Reinforcement Learning

[1] J. A. Anderson, et al. Talking Nets: An Oral History Of Neural Networks, 1998, IEEE Trans. Neural Networks.

[2] Tomás Svoboda, et al. Safe Exploration Techniques for Reinforcement Learning - An Overview, 2014, MESAS.

[3] Rajeev Alur, et al. A Theory of Timed Automata, 1994, Theor. Comput. Sci.

[4] Ufuk Topcu, et al. Safe Reinforcement Learning via Shielding, 2017, AAAI.

[5] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.

[6] Joseph Sifakis, et al. On the Synthesis of Discrete Controllers for Timed Systems (An Extended Abstract), 1995, STACS.

[7] Kim G. Larsen, et al. Uppaal Stratego, 2015, TACAS.

[8] Nathan Fulton, et al. Verifiably Safe Off-Model Reinforcement Learning, 2019, TACAS.

[9] Sebastian Junges, et al. A Storm is Coming: A Modern Probabilistic Model Checker, 2017, CAV.

[10] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.

[11] Kim G. Larsen, et al. Safe and Optimal Adaptive Cruise Control, 2015, Correct System Design.

[12] Kim G. Larsen, et al. It's Time to Play Safe: Shield Synthesis for Timed Systems, 2020, ArXiv.

[13] Christel Baier, et al. Principles of model checking, 2008.

[14] Chao Wang, et al. Shield Synthesis: Runtime Enforcement for Reactive Systems, 2015, TACAS.

[15] Thierry Jéron, et al. Optimal enforcement of (timed) properties with uncontrollable events, 2017, Mathematical Structures in Computer Science.

[16] Sebastian Junges, et al. Shielded Decision-Making in MDPs, 2018, ArXiv.

[17] Radu Calinescu, et al. Assured Reinforcement Learning with Formally Verified Abstract Policies, 2017, ICAART.

[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[19] Amir Pnueli, et al. The temporal logic of programs, 1977, 18th Annual Symposium on Foundations of Computer Science (sfcs 1977).

[20] Yliès Falcone, et al. On the Runtime Enforcement of Timed Properties, 2019, RV.

[21] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.

[22] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.