Extreme Risk Averse Policy for Goal-Directed Risk-Sensitive Markov Decision Process
[1] S. C. Jaquette. A Utility Criterion for Markov Decision Processes, 1976.
[2] R. Howard, et al. Risk-Sensitive Markov Decision Processes, 1972.
[3] F. B. Vernadat, et al. Decisions with Multiple Objectives: Preferences and Value Tradeoffs, 1994.
[4] D. Krass, et al. Percentile performance criteria for limiting average Markov decision processes, 1995, IEEE Trans. Autom. Control.
[5] Uriel G. Rothblum, et al. Optimal stopping, exponential utility, and linear programming, 1979, Math. Program.
[6] Vivek S. Borkar, et al. A sensitivity formula for risk-sensitive cost and the actor-critic algorithm, 2001, Syst. Control. Lett.
[7] Ralph L. Keeney, et al. Decisions with multiple objectives: preferences and value tradeoffs, 1976.
[8] Stella X. Yu, et al. Optimization Models for the First Arrival Target Distribution Function in Discrete Time, 1998.
[9] Jerzy A. Filar, et al. Variance-Penalized Markov Decision Processes, 1989, Math. Oper. Res.
[10] John N. Tsitsiklis, et al. An Analysis of Stochastic Shortest Path Problems, 1991, Math. Oper. Res.
[11] Ping Hou, et al. Revisiting Risk-Sensitive MDPs: New Algorithms and Results, 2014, ICAPS.
[12] R. L. Keeney, et al. Decisions with Multiple Objectives: Preferences and Value Trade-Offs, 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[13] Andrzej Ruszczynski, et al. Risk-averse dynamic programming for Markov decision processes, 2010, Math. Program.
[14] Blai Bonet, et al. A Concise Introduction to Models and Methods for Automated Planning, 2013.
[15] Vivek S. Borkar, et al. Q-Learning for Risk-Sensitive Control, 2002, Math. Oper. Res.
[16] Mohammad Ghavamzadeh, et al. Actor-Critic Algorithms for Risk-Sensitive MDPs, 2013, NIPS.
[17] Shie Mannor, et al. Policy Gradients with Variance Related Risk Criteria, 2012, ICML.
[18] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[19] Stephen D. Patek, et al. On terminating Markov decision processes with a risk-averse objective function, 2001, Autom.
[20] Uriel G. Rothblum, et al. Multiplicative Markov Decision Chains, 1984, Math. Oper. Res.
[21] Ping Hou, et al. Solving Risk-Sensitive POMDPs With and Without Cost Observations, 2016, AAAI.
[22] Karel Sladký, et al. Growth rates and average optimality in risk-sensitive Markov decision chains, 2008, Kybernetika.
[23] M. J. Sobel. The variance of discounted Markov decision processes, 1982.
[24] Valdinei Freire da Silva, et al. Shortest Stochastic Path with Risk Sensitive Evaluation, 2012, MICAI.