Controlled Markov chains with risk-sensitive criteria: Average cost, optimality equations, and optimal solutions

Abstract. We study controlled Markov chains with denumerable state space and bounded costs per stage. A (long-run) risk-sensitive average cost criterion, associated with an exponential utility function with a constant risk-sensitivity coefficient, is used as the performance measure. The main assumption on the probabilistic structure of the model is that the transition law satisfies a simultaneous Doeblin condition. Within this framework, our main results can be summarized as follows: if the risk-sensitivity coefficient is small enough, then an associated optimality equation has a bounded solution with a constant value for the optimal risk-sensitive average cost; in addition, under further standard continuity-compactness assumptions, optimal stationary policies are obtained. However, we also show that these conclusions fail to hold, in general, for sufficiently large values of the risk-sensitivity coefficient. Our results therefore disprove previous claims on this topic. Notably, our development is largely self-contained and employs only basic principles of probability and analysis.
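For concreteness, the criterion and the optimality equation mentioned in the abstract can be sketched as follows. This is the standard formulation used in the risk-sensitive MDP literature, not a transcription from the paper itself; the symbols $C$, $\lambda$, $p(\cdot\mid x,a)$, $g$, and $h$ are illustrative notation.

```latex
% Risk-sensitive average cost of a policy \pi from initial state x,
% with cost per stage C(x,a) and risk-sensitivity coefficient \lambda > 0:
J(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{n\lambda}
  \log E_x^{\pi}\!\left[ \exp\!\left( \lambda \sum_{t=0}^{n-1} C(X_t, A_t) \right) \right].

% Associated (multiplicative) optimality equation: a constant g and a
% bounded function h solving
e^{\lambda (g + h(x))} \;=\;
  \min_{a \in A(x)} \left\{ e^{\lambda C(x,a)}
  \sum_{y} p(y \mid x, a)\, e^{\lambda h(y)} \right\}, \qquad x \in S,
% identify g as the optimal risk-sensitive average cost; a stationary
% policy attaining the minimum at each state is then optimal.
```

The abstract's positive result corresponds to such a pair $(g, h)$ with $h$ bounded existing for small $\lambda$, and the negative result to its failure for large $\lambda$.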
