Paper information - Solution to the risk-sensitive average optimality equation in communicating Markov decision chains with finite state space: An alternative approach

Abstract. This note concerns Markov decision chains with finite state and action sets. The decision maker is assumed to be risk-averse with constant risk-sensitivity coefficient λ, and the performance of a control policy is measured by the risk-sensitive average cost criterion. In their seminal paper, Howard and Matheson established that, when the whole state space is a communicating class under the action of each stationary policy, the optimality equation has a solution for every λ>0. This paper presents an alternative proof of this fundamental result, one that explicitly highlights the essential role of the communication properties in the analysis of the risk-sensitive average cost criterion.
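
To make the criterion concrete, the sketch below treats only the simplest case: a single fixed stationary policy, with no minimization over actions. It assumes the usual fixed-policy form of the equation, exp(λ(g + h(x))) = exp(λ c(x)) Σ_y p(x,y) exp(λ h(y)), so that exp(λ g) is the Perron-Frobenius eigenvalue of the positive matrix Q(x,y) = exp(λ c(x)) p(x,y). This is an illustration under those assumptions, not the proof given in the note. The function name, the power-iteration scheme, and the example chain are hypothetical, and the iteration additionally assumes the chain is aperiodic.

```python
import math

def risk_sensitive_average_cost(P, c, lam, iters=10_000, tol=1e-12):
    """Sketch for ONE fixed stationary policy (hypothetical helper).

    P   : row-stochastic transition matrix (list of lists), assumed
          irreducible (communicating) and aperiodic
    c   : c[x] is the one-step cost in state x
    lam : risk-sensitivity coefficient, lam > 0
    Returns (g, h) with exp(lam*(g + h[x])) = exp(lam*c[x]) * sum_y P[x][y]*exp(lam*h[y]).
    """
    n = len(P)
    # Q[x][y] = exp(lam * c[x]) * P[x][y]
    Q = [[math.exp(lam * c[x]) * P[x][y] for y in range(n)] for x in range(n)]
    v = [1.0] * n          # candidate eigenvector, v[x] = exp(lam * h[x])
    rho = 1.0              # candidate eigenvalue, rho = exp(lam * g)
    for _ in range(iters):
        w = [sum(Q[x][y] * v[y] for y in range(n)) for x in range(n)]
        new_rho = max(w)                    # normalize so that max(v) == 1
        w = [wi / new_rho for wi in w]
        converged = (abs(new_rho - rho) < tol
                     and max(abs(a - b) for a, b in zip(w, v)) < tol)
        v, rho = w, new_rho
        if converged:
            break
    g = math.log(rho) / lam
    h = [math.log(vi) / lam for vi in v]
    return g, h

# Two states visited with equal probability, costs 0 and 1, lam = 1.
g, h = risk_sensitive_average_cost([[0.5, 0.5], [0.5, 0.5]], [0.0, 1.0], 1.0)
print(g)  # larger than the risk-neutral average cost 0.5
```

In this example the value g equals log((1 + e)/2), which exceeds the risk-neutral average 0.5, in line with a risk-averse decision maker penalizing cost variability.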

Daniel Hernández-Hernández | Rolando Cavazos-Cadena

[1] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.

[2] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[3] R. Howard, et al. Risk-Sensitive Markov Decision Processes, 1972.

[4] S. Marcus, et al. Risk sensitive control of Markov processes in countable state space, 1996.