Paper Information - Risk-Sensitive Markov Decision Processes

Risk-Sensitive Markov Decision Processes

This paper considers the maximization of certain equivalent reward generated by a Markov decision process with constant risk sensitivity. First, value iteration is used to optimize possibly time-varying processes of finite duration. Then a policy iteration procedure is developed to find the stationary policy with highest certain equivalent gain for the infinite duration case. A simple example demonstrates both procedures.
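The finite-duration value iteration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that constant risk sensitivity means an exponential utility with a nonzero risk-aversion coefficient `gamma`, under which the certain equivalent of "reward now plus an uncertain future" can be computed by replacing the future with its own certain equivalent. The function names, the `P[a][i][j]` / `R[a][i][j]` array layout, and the zero terminal reward are hypothetical choices made for illustration.

```python
import math


def certain_equivalent(probs, outcomes, gamma):
    """Certain equivalent of a lottery under exponential utility.

    Assumes gamma != 0; gamma > 0 is risk-averse, gamma < 0 is risk-seeking.
    """
    expected = sum(p * math.exp(-gamma * x) for p, x in zip(probs, outcomes))
    return -math.log(expected) / gamma


def risk_sensitive_value_iteration(P, R, gamma, horizon):
    """Backward recursion on certain equivalents for a finite horizon.

    P[a][i][j]: probability of moving from state i to j under action a.
    R[a][i][j]: reward earned on that transition.
    (Both layouts are hypothetical, chosen for this sketch.)

    Returns (values, policy), where values[i] is the certain equivalent
    with `horizon` stages remaining and policy[n][i] is the action chosen
    in state i with n + 1 stages remaining.
    """
    n_actions = len(P)
    n_states = len(P[0])
    values = [0.0] * n_states  # assumption: zero terminal reward
    policy = []
    for _ in range(horizon):
        new_values, stage_policy = [], []
        for i in range(n_states):
            # Certain equivalent of each action: transition reward plus
            # the certain equivalent of continuing from the next state.
            candidates = [
                certain_equivalent(
                    P[a][i],
                    [R[a][i][j] + values[j] for j in range(n_states)],
                    gamma,
                )
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: candidates[a])
            new_values.append(candidates[best])
            stage_policy.append(best)
        values = new_values
        policy.append(stage_policy)
    return values, policy
```

In this sketch, a risk-averse decision maker (`gamma > 0`) values a lottery below its expected reward, and a risk-seeking one (`gamma < 0`) values it above, so the same transition model can yield different optimal actions as `gamma` changes sign.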

R. Howard | J. Matheson

[1] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.