Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes

We present the conditional value-at-risk (CVaR) in the context of Markov chains and Markov decision processes with reachability and mean-payoff objectives. CVaR quantifies risk by means of the expectation of the worst p-quantile. As such, it can be used to design risk-averse systems. We consider not only CVaR constraints, but also their conjunction with expectation constraints and quantile constraints (value-at-risk, VaR). We derive lower and upper bounds on the computational complexity of the respective decision problems and characterize the structure of the strategies in terms of memory and randomization.
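
To make "the expectation of the worst p-quantile" concrete, the following minimal sketch computes VaR and CVaR for a finite discrete distribution of payoffs. It assumes that higher values are better, so "worst" means the lowest outcomes, and that an outcome straddling the p boundary contributes only the part of its probability mass needed to reach p. The function name `var_cvar` and the input format are illustrative choices, not notation from the paper.

```python
def var_cvar(outcomes, p):
    """Return (VaR_p, CVaR_p) for a finite discrete payoff distribution.

    outcomes: iterable of (value, probability) pairs summing to 1.
    p: risk level in (0, 1]; the worst p-fraction of the mass is averaged.
    Assumption: higher values are better, so the worst outcomes are the lowest.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    remaining = p      # probability mass still to be collected
    weighted = 0.0     # sum of value * collected mass
    var = None
    for value, prob in sorted(outcomes):
        if prob <= 0:
            continue
        take = min(prob, remaining)   # split the boundary outcome if needed
        weighted += take * value
        remaining -= take
        var = value                   # last value reached is the p-quantile
        if remaining <= 1e-12:
            break
    return var, weighted / p


if __name__ == "__main__":
    # Hypothetical payoff distribution: 0 w.p. 0.1, 10 w.p. 0.4, 20 w.p. 0.5
    dist = [(0, 0.1), (10, 0.4), (20, 0.5)]
    print(var_cvar(dist, 0.2))
```

With p = 1 the CVaR coincides with the plain expectation, while smaller p focuses on increasingly bad outcomes, which is why a CVaR constraint can be combined meaningfully with an expectation or VaR constraint.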
