Paper information - Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes

We present the conditional value-at-risk (CVaR) in the context of Markov chains and Markov decision processes with reachability and mean-payoff objectives. CVaR quantifies risk by means of the expectation of the worst p-quantile. As such it can be used to design risk-averse systems. We consider not only CVaR constraints, but also introduce their conjunction with expectation constraints and quantile constraints (value-at-risk, VaR). We derive lower and upper bounds on the computational complexity of the respective decision problems and characterize the structure of the strategies in terms of memory and randomization.
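The phrase "expectation of the worst p-quantile" can be made concrete for a finite distribution. The sketch below is illustrative only and is not taken from the paper: the function name `cvar_lower`, the example distribution, and the convention that lower values are worse (as with a payoff) are all assumptions. The paper's formal definition may treat boundary cases differently.

```python
from typing import Iterable, Tuple


def cvar_lower(outcomes: Iterable[Tuple[float, float]], p: float) -> float:
    """Average value over the worst (lowest) p fraction of probability mass.

    `outcomes` is a list of (value, probability) pairs whose probabilities
    sum to 1. Assumption: lower values are worse, as with a payoff.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")

    remaining = p   # probability mass still to be collected from the tail
    total = 0.0     # probability-weighted sum of the collected values
    for value, prob in sorted(outcomes):
        take = min(prob, remaining)  # split an atom if it straddles the cut
        total += take * value
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / p


# Hypothetical example: payoff 0 with probability 0.1, payoff 10 otherwise.
example = [(0.0, 0.1), (10.0, 0.9)]
worst_fifth = cvar_lower(example, 0.2)   # averages over the worst 20% of mass
plain_mean = cvar_lower(example, 1.0)    # p = 1 recovers the expectation
```

With p = 1 the quantity coincides with the ordinary expectation. Smaller p puts more weight on the bad outcomes, which is why a lower bound on CVaR can serve as a risk-averse constraint.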

Jan Křetínský | Tobias Meggendorfer

[1] Tanya Styblo Beder, et al. VAR: Seductive but Dangerous, 1995.

[2] Marco Pavone, et al. Risk aversion in finite Markov Decision Processes using total cost criteria and average value at risk, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[3] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[4] R. Rockafellar, et al. Optimization of conditional value-at-risk, 2000.

[5] John F. Canny, et al. Some algebraic and geometric computations in PSPACE, 1988, STOC '88.

[6] Krishnendu Chatterjee, et al. Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes, 2015, 2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science.

[7] Christel Baier, et al. Probabilistic Model Checking and Non-standard Multi-objective Reasoning, 2014, FASE.

[8] Insoon Yang, et al. Optimal Control of Conditional Value-at-Risk in Continuous Time, 2015, SIAM J. Control. Optim.

[9] Christel Baier, et al. Trade-off analysis meets probabilistic model checking, 2014, CSL-LICS.

[10] Philippe Artzner, et al. Coherent Measures of Risk, 1999.

[11] Hiroe Tsubaki, et al. Conditional Value-at-Risk for Random Immediate Reward Variables in Markov Decision Processes, 2011, Am. J. Comput. Math.

[12] Yan Xu, et al. Optimizing Quantiles in Preference-Based Markov Decision Processes, 2016, AAAI.

[13] Christel Baier, et al. Energy-Utility Quantiles, 2014, NASA Formal Methods.

[14] Krishnendu Chatterjee, et al. Trading Performance for Stability in Markov Decision Processes, 2013, 2013 28th Annual ACM/IEEE Symposium on Logic in Computer Science.

[15] R. Rockafellar, et al. Conditional Value-at-Risk for General Loss Distributions, 2001.

[16] Krishnendu Chatterjee, et al. Multi-objective Discounted Reward Verification in Graphs and MDPs, 2013, LPAR.

[17] Xianping Guo, et al. Minimum Average Value-at-Risk for Finite Horizon Semi-Markov Decision Processes in Continuous Time, 2016, SIAM J. Optim.

[18] Véronique Bruyère, et al. Meet Your Expectations With Guarantees: Beyond Worst-Case Synthesis in Quantitative Games, 2013, STACS.

[19] U. Rieder, et al. Markov Decision Processes, 2010.

[20] Markus Lohrey, et al. Computing quantiles in Markov chains with multi-dimensional costs, 2017, 2017 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS).

[21] Mihalis Yannakakis, et al. The complexity of probabilistic verification, 1995, JACM.

[22] Hongyang Qu, et al. Quantitative Multi-objective Verification for Probabilistic Systems, 2011, TACAS.

[23] Mickael Randour, et al. Percentile queries in multi-dimensional Markov decision processes, 2014, CAV.

[24] Krishnendu Chatterjee, et al. Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes, 2011, 2011 IEEE 26th Annual Symposium on Logic in Computer Science.

[25] Christel Baier, et al. Computing Quantiles in Markov Reward Models, 2013, FoSSaCS.

[26] Christoph Haase, et al. The Odds of Staying on Budget, 2014, ICALP.

[27] Lorenzo Clemente, et al. Multidimensional beyond Worst-Case and Almost-Sure Problems for Mean-Payoff Objectives, 2015, 2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science.

[28] Christel Baier, et al. Maximizing the Conditional Expected Reward for Reaching the Goal, 2017, TACAS.

[29] D. Krass, et al. Percentile performance criteria for limiting average Markov decision processes, 1995, IEEE Trans. Autom. Control.

[30] Stan Uryasev, et al. Conditional Value-at-Risk: Optimization Approach, 2001.

[31] Zohar Manna, et al. Formal verification of probabilistic systems, 1997.

[32] Krishnendu Chatterjee, et al. Value Iteration for Long-Run Average Reward in Markov Decision Processes, 2017, CAV.

[33] Kousha Etessami, et al. Multi-Objective Model Checking of Markov Decision Processes, 2007, Log. Methods Comput. Sci.

[34] Nicole Bäuerle, et al. Markov Decision Processes with Average-Value-at-Risk criteria, 2011, Math. Methods Oper. Res.

[35] Margaret L. Brandeau, et al. Quantile Markov Decision Process, 2017, ArXiv.