Safe Exploration Techniques for Reinforcement Learning - An Overview
[1] R. Bellman, et al. Dynamic Programming and Markov Processes, 1960.
[2] Ronald J. Williams, et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, 1993.
[3] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[4] Matthias Heger, et al. Consideration of Risk in Reinforcement Learning, 1994, ICML.
[5] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[6] Jeff G. Schneider, et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning, 1996, NIPS.
[7] Steven I. Marcus, et al. Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes, 1999, Autom.
[8] Peter Geibel, et al. Reinforcement Learning with Bounded Risk, 2001, ICML.
[9] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[10] Ralph Neuneier, et al. Risk-Sensitive Reinforcement Learning, 1998, Machine Learning.
[11] Richard S. Sutton, et al. Associative search network: A reinforcement learning associative memory, 1981, Biological Cybernetics.
[12] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[13] Fritz Wysotzki, et al. Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, 2005, J. Artif. Intell. Res.
[14] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[15] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[16] Shie Mannor, et al. Percentile optimization in uncertain Markov decision processes with application to efficient exploration, 2007, ICML '07.
[17] Manuela M. Veloso, et al. Confidence-based policy learning from demonstration using Gaussian mixture models, 2007, AAMAS '07.
[18] Steffen Udluft, et al. Safe exploration for reinforcement learning, 2008, ESANN.
[19] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[20] Francisco Javier García-Polo, et al. Safe reinforcement learning in high-risk tasks through policy improvement, 2011, ADPRL.
[21] Claire J. Tomlin, et al. Guaranteed safe online learning of a bounded system, 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Dirk Söffker, et al. Towards learning of safety knowledge from human demonstrations, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[23] Javier García, et al. Safe Exploration of State and Action Spaces in Reinforcement Learning, 2012, J. Artif. Intell. Res.
[24] Kee-Eung Kim, et al. Cost-Sensitive Exploration in Bayesian Reinforcement Learning, 2012, NIPS.
[25] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.
[26] Phillipp Bergmann. Dynamic Programming Deterministic And Stochastic Models, 2016.