Discrete Dynamic Programming with a Small Interest Rate

Abstract: In a fundamental paper on stationary finite state and action Markovian decision processes, Blackwell defines an optimal policy to be one that maximizes the expected total discounted reward for all sufficiently small interest rates rho > 0. He also establishes the existence of a stationary optimal policy by a limiting argument that does not yield a finite algorithm. The purpose of this paper is to prove this result constructively by devising a finite policy improvement method for finding stationary optimal policies. The algorithm is based on a new representation of the vector of expected discounted returns under a stationary policy as a power series in the interest rate, valid for all small enough rho > 0.
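
The series representation mentioned above can be illustrated numerically. The sketch below is not taken from the paper. It assumes the standard textbook form of the expansion, V_rho = (1 + rho) * sum over n >= -1 of rho^n * y_n, with y_{-1} = P* r and y_n = (-1)^n H^(n+1) r, where P* is the limiting matrix of the policy's transition matrix P and H = (I - P + P*)^(-1) - P* is the deviation matrix. It also assumes rewards are received at the start of each period and the discount factor is beta = 1 / (1 + rho). The two-state chain, the reward vector, and all function names are hypothetical and chosen only for illustration.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def inv2(M):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def identity(i, j):
    return 1.0 if i == j else 0.0


def discounted_value(P, r, rho):
    """Direct solve of V = r + beta * P V with beta = 1 / (1 + rho)."""
    beta = 1.0 / (1.0 + rho)
    M = [[identity(i, j) - beta * P[i][j] for j in range(2)] for i in range(2)]
    return matvec(inv2(M), r)


def series_value(P, r, rho, terms=30):
    """Truncated series in powers of rho (assumed textbook form).

    Works only for a two-state chain with P[0][1] + P[1][0] > 0, because
    the stationary distribution is written in closed form for that case.
    """
    a, b = P[0][1], P[1][0]
    pi = [b / (a + b), a / (a + b)]          # stationary distribution
    P_star = [pi, pi]                        # limiting matrix: identical rows
    Z = inv2([[identity(i, j) - P[i][j] + P_star[i][j] for j in range(2)]
              for i in range(2)])
    H = [[Z[i][j] - P_star[i][j] for j in range(2)] for i in range(2)]

    y_minus_1 = matvec(P_star, r)            # y_{-1} = P* r
    total = [y / rho for y in y_minus_1]     # the rho^(-1) term
    term = matvec(H, r)                      # y_0 = H r
    for n in range(terms):
        total = [t + (rho ** n) * x for t, x in zip(total, term)]
        term = [-x for x in matvec(H, term)]  # y_{n+1} = -H y_n
    return [(1.0 + rho) * t for t in total]


# Hypothetical two-state policy: transition matrix and one-period rewards.
P = [[0.5, 0.5], [0.25, 0.75]]
r = [1.0, 0.0]
rho = 0.05

print(discounted_value(P, r, rho))
print(series_value(P, r, rho))
```

For this example the two printed vectors should agree to within floating-point error, since rho = 0.05 is well inside the region where the truncated series converges. The leading 1/rho term dominates as rho shrinks, which is why comparing policies for all small rho reduces to comparing the coefficient vectors y_{-1}, y_0, y_1, ... in order.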