State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning
[1] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[2] Jia Yuan Yu, et al. Effect of Reward Function Choices in MDPs with Value-at-Risk, 2016, arXiv:1612.02088.
[3] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.
[4] Andreas Krause, et al. Safe Model-based Reinforcement Learning with Stability Guarantees, 2017, NIPS.
[5] J. Durbin. Distribution theory for tests based on the sample distribution function, 1973.
[6] Andrzej Ruszczynski, et al. Risk-averse dynamic programming for Markov decision processes, 2010, Math. Program.
[7] Matthew J. Sobel, et al. Mean-Variance Tradeoffs in an Undiscounted MDP, 1994, Oper. Res.
[8] Frank Riedel, et al. Dynamic Coherent Risk Measures, 2003.
[9] E. Altman. Constrained Markov Decision Processes, 1999.
[10] Michael C. Fu, et al. Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control, 2015, ICML.
[11] Vivek S. Borkar, et al. Q-Learning for Risk-Sensitive Control, 2002, Math. Oper. Res.
[13] J. Hess, et al. Analysis of variance, 2018, Transfusion.
[14] M. Woodroofe. A central limit theorem for functions of a Markov chain with applications to shifts, 1992.
[15] M. J. Sobel. The variance of discounted Markov decision processes, 1982.
[17] Sebastian Junges, et al. Safety-Constrained Reinforcement Learning for MDPs, 2015, TACAS.
[18] Wenjie Huang, et al. Risk-aware Q-learning for Markov decision processes, 2017, 2017 IEEE 56th Annual Conference on Decision and Control (CDC).
[20] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[21] Marco Pavone, et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria, 2015, J. Mach. Learn. Res.
[22] Paul Weng, et al. Quantile Reinforcement Learning, 2016, arXiv.
[23] S. Kusuoka. On law invariant coherent risk measures, 2001.
[24] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[25] D. Krass, et al. Percentile performance criteria for limiting average Markov decision processes, 1995, IEEE Trans. Autom. Control.
[26] Koichiro Yamauchi, et al. Risk Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget, 2016, ICONIP.
[27] John N. Tsitsiklis, et al. Mean-Variance Optimization in Markov Decision Processes, 2011, ICML.
[28] D. White. Mean, variance, and probabilistic criteria in finite Markov decision processes: A review, 1988.
[29] Philippe Artzner, et al. Coherent Measures of Risk, 1999.
[30] Congbin Wu, et al. Minimizing risk models in Markov decision processes with policies depending on target values, 1999.
[31] Alexander Shapiro, et al. Optimization of Convex Risk Functions, 2006, Math. Oper. Res.