Risk-Sensitive Markov Control Processes

We introduce a general framework for measuring risk in Markov control processes on general Borel spaces. The framework is built on risk maps, which generalize known concepts of risk measures from mathematical finance, operations research, and behavioral economics. Working in weighted norm spaces so that unbounded costs are also covered, we study two infinite-horizon risk-sensitive criteria, discounted total risk and average risk, and solve the associated optimization problems by dynamic programming. For the discounted case, we propose a new discount scheme that differs from the conventional form but is consistent with the existing literature. For the average risk criterion, we state Lyapunov-like stability conditions that generalize known conditions for Markov chains and ensure the existence of solutions to the optimality equation.
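As a rough illustration of how dynamic programming with a risk map can look, the following is a minimal sketch under strong simplifying assumptions; it is not the construction of the paper. It assumes a finite state and action space instead of general Borel spaces, and it uses the entropic risk map (1/lam) * log E[exp(lam * X)] as one example of a risk map. It applies the conventional nested discounting c + gamma * R(V), not the new discount scheme proposed here. The names `entropic_risk`, `risk_value_iteration`, `P`, `c`, `gamma`, and `lam` are hypothetical and chosen for this example only.

```python
import math


def entropic_risk(probs, values, lam):
    """Entropic risk map (1/lam) * log E[exp(lam * X)] for lam > 0.

    The maximum is factored out so the exponentials cannot overflow.
    """
    assert lam > 0, "this sketch only covers the risk-averse case lam > 0"
    m = max(values)
    s = sum(p * math.exp(lam * (v - m)) for p, v in zip(probs, values))
    return m + math.log(s) / lam


def risk_value_iteration(P, c, gamma, lam, tol=1e-10, max_iter=10_000):
    """Iterate V(s) = min_a [ c(s,a) + gamma * R(V | s,a) ] to a fixed point.

    P[s][a] is a list of transition probabilities, c[s][a] a one-stage cost.
    Assumption: conventional nested discounting with 0 < gamma < 1.
    """
    n = len(P)

    def q(V, s, a):
        # Risk-adjusted cost of taking action a in state s.
        return c[s][a] + gamma * entropic_risk(P[s][a], V, lam)

    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [min(q(V, s, a) for a in range(len(P[s]))) for s in range(n)]
        diff = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if diff < tol:
            break
    # Greedy policy with respect to the final value function.
    policy = [min(range(len(P[s])), key=lambda a: q(V, s, a)) for s in range(n)]
    return V, policy


# Toy model (made up for illustration): state 0 is "normal", state 1 is "failed".
# In state 0, action 0 is safe and expensive, action 1 is cheap but may fail.
P = [
    [[1.0, 0.0], [0.9, 0.1]],  # transitions from state 0
    [[1.0, 0.0], [1.0, 0.0]],  # state 1 always returns to state 0
]
c = [
    [1.0, 0.5],    # costs in state 0
    [10.0, 10.0],  # costs in state 1
]

V, policy = risk_value_iteration(P, c, gamma=0.9, lam=1.0)
print("values:", V)
print("policy:", policy)
```

Because the entropic risk map is monotone and translation invariant, the iteration above is a contraction with modulus `gamma` in the supremum norm, which is why plain fixed-point iteration suffices in this finite toy setting. The unbounded-cost and average-risk cases treated in the paper need the weighted norm and stability machinery and are not covered by this sketch.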
