Variance-Based Risk Estimations in Markov Processes via Transformation with State Lumping
[1] D. White. Mean, variance, and probabilistic criteria in finite Markov decision processes: A review, 1988.
[2] Jia Yuan Yu, et al. State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning, 2019, AAAI.
[3] Wenjie Huang, et al. Risk-aware Q-learning for Markov decision processes, 2017, 2017 IEEE 56th Annual Conference on Decision and Control (CDC).
[4] Balaraman Ravindran, et al. Model Minimization in Hierarchical Reinforcement Learning, 2002, SARA.
[5] Klaus Obermayer, et al. Risk-Sensitive Reinforcement Learning, 2013, Neural Computation.
[6] M. Rosenblatt, et al. A Markovian Function of a Markov Chain, 1958.
[7] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[8] Andrzej Ruszczynski, et al. Risk-averse dynamic programming for Markov decision processes, 2010, Math. Program.
[9] R. Howard, et al. Risk-Sensitive Markov Decision Processes, 1972.
[10] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[11] Li Xia. Mean-variance optimization of discrete time discounted Markov decision processes, 2018, Autom.
[12] John N. Tsitsiklis, et al. Mean-Variance Optimization in Markov Decision Processes, 2011, ICML.
[13] Marco Pavone, et al. Risk-Constrained Reinforcement Learning with Percentile Risk Criteria, 2015, J. Mach. Learn. Res.
[14] S. Kusuoka. On law invariant coherent risk measures, 2001.
[15] Tsan-Ming Choi, et al. Mean–Variance Analysis for the Newsvendor Problem, 2008, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.
[16] Andreas Krause, et al. Safe Model-based Reinforcement Learning with Stability Guarantees, 2017, NIPS.
[17] Peter G. Harrison, et al. Performance modelling of communication networks and computer architectures, 1992, International computer science series.
[18] Matthew J. Sobel, et al. Mean-Variance Tradeoffs in an Undiscounted MDP, 1994, Oper. Res.
[19] M. J. Sobel, et al. Discounted MDP's: distribution functions and exponential utility maximization, 1987.
[20] E. Altman. Constrained Markov Decision Processes, 1999.
[21] Vivek S. Borkar, et al. Q-Learning for Risk-Sensitive Control, 2002, Math. Oper. Res.
[22] M. J. Sobel. The variance of discounted Markov decision processes, 1982.
[23] Tsan-Ming Choi, et al. Supply chain risk analysis with mean-variance models: a technical review, 2016, Ann. Oper. Res.
[24] Hon-Shiang Lau. The Newsboy Problem under Alternative Optimization Objectives, 1980.