Markov decision processes with state-dependent discount factors and unbounded rewards/costs

Abstract: This paper studies discrete-time Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Under general conditions, we develop an iteration algorithm for computing the optimal value function and prove the existence of optimal stationary policies. We illustrate the results with a cash-balance model.
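To make the idea of an iteration algorithm with state-dependent discounting concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a finite state and action space with bounded costs, whereas the paper treats unbounded rewards/costs under general conditions. It also assumes the Bellman operator takes the form V(x) = min_a [c(x, a) + alpha(x) * sum_y P(y | x, a) V(y)]. The names `value_iteration`, `costs`, `trans`, and `alpha`, and the two-state example data, are hypothetical and chosen only for illustration.

```python
def q_values(v, costs, trans, alpha):
    """One-step lookahead costs Q[x][a] under the state-dependent discount alpha[x]."""
    return [
        [
            costs[x][a] + alpha[x] * sum(p * v[y] for y, p in enumerate(trans[x][a]))
            for a in range(len(costs[x]))
        ]
        for x in range(len(costs))
    ]


def value_iteration(costs, trans, alpha, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman operator from v = 0 until successive iterates differ by < tol.

    costs[x][a]    : one-stage cost in state x under action a (assumed bounded here)
    trans[x][a][y] : probability of moving from x to y under action a
    alpha[x]       : discount factor applied in state x, assumed to lie in [0, 1)
    """
    v = [0.0] * len(costs)
    for _ in range(max_iter):
        new_v = [min(row) for row in q_values(v, costs, trans, alpha)]
        gap = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    # Greedy stationary policy with respect to the final value estimate.
    q = q_values(v, costs, trans, alpha)
    policy = [min(range(len(row)), key=row.__getitem__) for row in q]
    return v, policy


# Hypothetical two-state, two-action example (illustrative data only).
costs = [[1.0, 3.0], [0.0, 2.0]]
trans = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
]
alpha = [0.9, 0.5]  # the discount factor depends on the current state

v, policy = value_iteration(costs, trans, alpha)
```

In this toy instance, staying in state 0 forever costs 1 / (1 - 0.9) = 10, while paying 3 once to reach the cost-free state 1 costs 3 in total. The iteration therefore settles on values 3 and 0 and on the policy that moves from state 0 and stays in state 1.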
