A formal methods approach to interpretable reinforcement learning for robotic planning
Calin Belta | Xiao Li | Guang Yang | Zachary T. Serlin