Learning Safe Policies via Primal-Dual Methods
Santiago Paternain | Miguel Calvo-Fullana | Luiz F. O. Chamon | Alejandro Ribeiro