Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models
Dorsa Sadigh | Erdem Biyik | Jonathan Margoliash | Shahrouz Ryan Alimo
[1] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.
[2] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[3] Steven I. Marcus, et al. Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes, 1999, Autom.
[4] Pieter Abbeel, et al. Exploration and apprenticeship learning in reinforcement learning, 2005, ICML.
[5] Mykel J. Kochenderfer, et al. Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks, 2017, CAV.
[6] Ashish Kapoor, et al. Safe Control under Uncertainty with Probabilistic Signal Temporal Logic, 2016, Robotics: Science and Systems.
[7] Andreas Krause, et al. Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics, 2016, Machine Learning.
[8] S. Shankar Sastry, et al. Provably safe and robust learning-based model predictive control, 2011, Autom.
[9] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[10] Makoto Sato, et al. TD algorithm for the variance of return and mean-variance reinforcement learning, 2001.
[11] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[12] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[13] Andreas Krause, et al. Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, 2016, NIPS.
[14] Claire J. Tomlin, et al. Guaranteed safe online learning of a bounded system, 2011, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[15] Pieter Abbeel, et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning, 2010, Int. J. Robotics Res.
[16] Paulo Tabuada, et al. Control Barrier Function Based Quadratic Programs for Safety Critical Systems, 2016, IEEE Transactions on Automatic Control.
[17] Yisong Yue, et al. Safe Exploration and Optimization of Constrained MDPs Using Gaussian Processes, 2018, AAAI.
[18] Matthias Heger, et al. Consideration of Risk in Reinforcement Learning, 1994, ICML.
[19] Ashish Kapoor, et al. Fast Safe Mission Plans for Autonomous Vehicles, 2016.
[20] Javier García, et al. Safe Exploration of State and Action Spaces in Reinforcement Learning, 2012, J. Artif. Intell. Res.
[21] Pooriya Beyhaghi, et al. Optimization combining derivative-free global exploration with derivative-based local refinement, 2017, IEEE 56th Annual Conference on Decision and Control (CDC).
[22] Jaime F. Fisac, et al. Reachability-based safe learning with Gaussian processes, 2014, 53rd IEEE Conference on Decision and Control.
[23] Fritz Wysotzki, et al. Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, 2005, J. Artif. Intell. Res.
[24] Chris Gaskett, et al. Reinforcement learning under circumstances beyond its control, 2003.
[25] Vivek S. Borkar, et al. Q-Learning for Risk-Sensitive Control, 2002, Math. Oper. Res.