Discounted MDP's: distribution functions and exponential utility maximization

The present value of the rewards associated with a discrete-time Markov process has a probability distribution that depends on the initial state. The first part of the paper applies fixed-point theory to a system of equations for the distribution functions of the present value. The second part extends the model to a Markov decision process (MDP) and considers maximization of the expected utility of the present value when the utility function is exponential.
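
To make the abstract concrete, the following sketch indicates the kind of system of equations involved, in notation that is illustrative rather than the paper's own: assume a finite state space $S$, one-step transition probabilities $p_{ss'}$, a reward $r(s)$ received in state $s$, and a discount factor $\beta \in (0,1)$.

% A sketch under the assumptions stated in the preceding paragraph:
% B is the discounted present value of the reward stream, and F_s is
% its distribution function given initial state s.
\[
  B \;=\; \sum_{n=0}^{\infty} \beta^{n}\, r(s_n),
  \qquad
  F_s(b) \;=\; \Pr\{\, B \le b \mid s_0 = s \,\}.
\]
% Conditioning on the first transition gives B = r(s) + \beta B',
% where B' has the present-value distribution started from the next
% state, so the distribution functions satisfy the fixed-point system
\[
  F_s(b) \;=\; \sum_{s' \in S} p_{ss'}\,
  F_{s'}\!\left( \frac{b - r(s)}{\beta} \right),
  \qquad s \in S.
\]

For the second part, one common normalization of an exponential utility (assumed here for illustration, not taken from the paper) is $u(b) = -e^{-\gamma b}$ with risk parameter $\gamma > 0$, so the objective is to choose a policy $\pi$ maximizing

\[
  E^{\pi}\!\left[\, -e^{-\gamma B} \;\middle|\; s_0 = s \,\right],
\]

which is equivalent to minimizing $E^{\pi}\!\left[ e^{-\gamma B} \mid s_0 = s \right]$, the moment generating function of $B$ evaluated at $-\gamma$.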