Effect of Reward Function Choices in MDPs with Value-at-Risk
[1] S. Meyn, et al. Spectral theory and limit theorems for geometrically ergodic Markov processes, 2002, math/0209200.
[2] Shie Mannor, et al. Percentile Optimization for Markov Decision Processes with Parameter Uncertainty, 2010, Oper. Res.
[3] D. White. Mean, variance, and probabilistic criteria in finite Markov decision processes: A review, 1988.
[4] Philippe Artzner, et al. Coherent Measures of Risk, 1999.
[5] Olivier Buffet, et al. Revisiting Goal Probability Analysis in Probabilistic Planning, 2016, ICAPS.
[6] Hector Geffner, et al. Heuristic Search for Generalized Stochastic Shortest Path MDPs, 2011, ICAPS.
[7] Ping Hou, et al. Revisiting Risk-Sensitive MDPs: New Algorithms and Results, 2014, ICAPS.
[8] P. Glynn. A Lyapunov Bound for Solutions of Poisson's Equation, 1989.
[9] E. Altman. Constrained Markov Decision Processes, 1999.
[10] Michael C. Fu, et al. Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control, 2015, ICML.
[11] Yoshio Ohtsubo, et al. Optimal policy for minimizing risk models in Markov decision processes, 2002.
[12] Shie Mannor, et al. Probabilistic Goal Markov Decision Processes, 2011, IJCAI.
[13] Miguel A. Lejeune, et al. An Exact Solution Approach for Portfolio Optimization Problems Under Stochastic and Integer Constraints, 2009, Oper. Res.
[14] Jia Yuan Yu, et al. Central-limit approach to risk-aware Markov decision processes, 2015, ArXiv.
[15] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.
[16] Harry Zheng. Efficient frontier of utility and CVaR, 2009, Math. Methods Oper. Res.
[17] Mickael Randour, et al. Percentile queries in multi-dimensional Markov decision processes, 2014, CAV.
[18] Louis Wehenkel, et al. Risk-aware decision making and dynamic programming, 2008.
[19] John N. Tsitsiklis, et al. Mean-Variance Optimization in Markov Decision Processes, 2011, ICML.
[20] T. Vorst. Optimal Portfolios under a Value at Risk Constraint, 2001.
[21] Peng Dai, et al. Topological Value Iteration Algorithms, 2011, J. Artif. Intell. Res.
[22] Matthew J. Sobel, et al. Mean-Variance Tradeoffs in an Undiscounted MDP, 1994, Oper. Res.
[23] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[24] Frank Riedel, et al. Dynamic Coherent Risk Measures, 2003.
[25] Jerzy A. Filar, et al. Time Consistent Dynamic Risk Measures, 2006, Math. Methods Oper. Res.
[26] Stella X. Yu, et al. Optimization Models for the First Arrival Target Distribution Function in Discrete Time, 1998.
[27] M. Bouakiz, et al. Target-level criterion in Markov decision processes, 1995.
[28] Cyrus Derman, et al. Finite State Markovian Decision Processes, 1970.
[29] Akifumi Kira, et al. Threshold probability of non-terminal type in finite horizon Markov decision processes, 2012.
[30] Michael C. Fu, et al. Cumulative Prospect Theory Meets Reinforcement Learning: Estimation and Control, 2015, ArXiv.
[31] D. Krass, et al. Percentile performance criteria for limiting average Markov decision processes, 1995, IEEE Trans. Autom. Control.
[32] Congbin Wu, et al. Minimizing risk models in Markov decision processes with policies depending on target values, 1999.
[33] Alexander Shapiro, et al. Optimization of Convex Risk Functions, 2006, Math. Oper. Res.
[34] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.