The vanishing discount approach in Markov chains with risk-sensitive criteria

This paper studies stochastic dynamic systems modeled by a Markov cost/reward chain on a countable state space that satisfies a Lyapunov-type stability condition. Risk-sensitive (exponential) discounted and average cost criteria are considered over an infinite planning horizon. The main contribution is a vanishing discount approach that relates the discounted problem to the average one as the discount factor increases to one, i.e., as discounting disappears. Compared with the well-established risk-neutral case, the results are novel and reveal several fundamental and surprising differences. Further contributions include the use of convex analytic arguments to obtain appropriately convergent sequences, and a verification theorem for unbounded solutions to the average cost Poisson equation arising in the risk-sensitive case. The development is also largely self-contained and relies only on basic principles of probability and analysis.
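
For orientation, the criteria named above can be written in the form commonly used in the risk-sensitive literature. This is a sketch under assumed notation, not necessarily the paper's own: the chain $X_t$ has transition probabilities $p_{xy}$, $C$ is the one-stage cost, $\lambda > 0$ is the risk-sensitivity parameter, $\alpha \in (0,1)$ is the discount factor, and $(g, h)$ denotes a solution pair of the Poisson equation. All of these symbols are assumptions introduced here for illustration.

```latex
% Risk-sensitive (exponential) discounted cost, assumed notation
V_\alpha(x) = \frac{1}{\lambda} \log \mathbb{E}_x\!\left[ \exp\!\Big( \lambda \sum_{t=0}^{\infty} \alpha^{t} C(X_t) \Big) \right]

% Risk-sensitive (exponential) average cost
J(x) = \limsup_{n \to \infty} \frac{1}{\lambda n} \log \mathbb{E}_x\!\left[ \exp\!\Big( \lambda \sum_{t=0}^{n-1} C(X_t) \Big) \right]

% Multiplicative Poisson equation for the average cost g with relative value h
e^{\lambda ( g + h(x) )} = e^{\lambda C(x)} \sum_{y} p_{xy} \, e^{\lambda h(y)}

% Risk-neutral benchmark: the classical vanishing discount limit,
% where V_\alpha^{0} denotes the expected (risk-neutral) discounted cost
\lim_{\alpha \uparrow 1} (1 - \alpha) \, V_\alpha^{0}(x) = g^{0}
```

The last line is the familiar risk-neutral relation against which the abstract contrasts its results; how, and in what form, an analogous limit holds for $V_\alpha$ is the subject of the paper and is not asserted here.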
