A Convex Analytic Approach to Risk-Aware Markov Decision Processes

In classical Markov decision process (MDP) theory, we search for a policy that minimizes, for example, the expected infinite-horizon discounted cost. Expectation is a risk-neutral measure, which does not suffice in many applications, particularly in finance. We replace the expectation with a general risk functional and call the resulting models risk-aware MDP models. We consider minimization of such risk functionals in two cases: the expected utility framework and conditional value-at-risk, a popular coherent risk measure. Later, we consider risk-aware MDPs in which the risk is expressed in the constraints. These include stochastic dominance constraints and the classical chance-constrained optimization problems. In each case, we develop a convex analytic approach to solve such risk-aware MDPs. In most cases, we show that the problem can be formulated as an infinite-dimensional linear program (LP) in occupation measures when we augment the state space. We provide a discretization method and finite approximati...
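To make the conditional value-at-risk objective concrete, the sketch below evaluates CVaR of a small cost sample in two ways: through the well-known variational formula CVaR_alpha(X) = min over eta of { eta + E[(X - eta)^+] / (1 - alpha) }, and as the average of the worst (1 - alpha) fraction of outcomes. It is only an illustration under stated assumptions: the function names, the equally weighted sample, and the choice alpha = 0.8 are hypothetical and are not taken from the paper, whose formulation works with occupation measures on an augmented state space.

```python
def cvar_variational(costs, alpha):
    """CVaR at level alpha of an equally weighted cost sample,
    computed as min over eta of eta + E[(X - eta)^+] / (1 - alpha)."""
    n = len(costs)

    def objective(eta):
        # Sample mean of the excess of each cost over the threshold eta.
        excess = sum(max(x - eta, 0.0) for x in costs) / n
        return eta + excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in eta with kinks at the
    # sample values, so for 0 < alpha < 1 a minimizer exists among them.
    return min(objective(eta) for eta in costs)


def cvar_tail_average(costs, alpha):
    """Mean of the worst (1 - alpha) fraction of costs.
    Assumes (1 - alpha) * len(costs) is a positive integer."""
    k = round((1.0 - alpha) * len(costs))
    worst = sorted(costs)[-k:]
    return sum(worst) / k


# Hypothetical sample: ten equally likely costs.
sample_costs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
alpha = 0.8

cvar_a = cvar_variational(sample_costs, alpha)
cvar_b = cvar_tail_average(sample_costs, alpha)
mean_cost = sum(sample_costs) / len(sample_costs)
print(cvar_a, cvar_b, mean_cost)
```

The two CVaR values agree up to floating-point rounding and exceed the plain mean of the sample, which is the sense in which CVaR penalizes the upper tail of the cost distribution rather than its average. The variational form is linear in the distribution once eta is fixed, which suggests why an auxiliary variable of this kind can be compatible with a linear programming formulation.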
