Technical Note - The Method of Successive Approximations and Markovian Decision Problems

This note considers Howard's discrete-time Markovian decision model with the average return as the criterion. Using results of Blackwell and MacQueen for the discounted-return model, it is shown in full generality that the Odoni bounds contain both the maximal average return and the average return of the current policy.
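
As an illustration only, and not taken from the note itself, the bounds can be sketched for successive approximations on a small model. The sketch assumes the bounds in their commonly stated form: at each iteration, the smallest and largest componentwise differences between successive value vectors. The function name, the data layout, and the two-state data are hypothetical, and the renormalization step is a convenience that leaves the differences unchanged.

```python
# Hypothetical two-state, two-action model (illustrative data only).
# REWARDS[i][a]     : expected one-step return in state i under action a
# TRANSITIONS[i][a] : transition probabilities from state i under action a
REWARDS = [[6.0, 4.0], [-3.0, -5.0]]
TRANSITIONS = [
    [[0.5, 0.5], [0.8, 0.2]],
    [[0.4, 0.6], [0.7, 0.3]],
]


def odoni_value_iteration(rewards, transitions, tol=1e-8, max_iter=10_000):
    """Successive approximations with bounds from successive differences.

    Returns (lower, upper, policy), where lower and upper are the minimum and
    maximum of v_{n+1}(i) - v_n(i) over the states i at the last iteration,
    and policy is the list of maximizing actions at that iteration.
    """
    n_states = len(rewards)
    v = [0.0] * n_states
    lower, upper, policy = float("-inf"), float("inf"), [0] * n_states
    for _ in range(max_iter):
        new_v, policy = [], []
        for i in range(n_states):
            # One-step lookahead value of each action in state i.
            q = [
                rewards[i][a]
                + sum(p * v[j] for j, p in enumerate(transitions[i][a]))
                for a in range(len(rewards[i]))
            ]
            best = max(range(len(q)), key=q.__getitem__)  # first maximizer
            policy.append(best)
            new_v.append(q[best])
        diffs = [new_v[i] - v[i] for i in range(n_states)]
        lower, upper = min(diffs), max(diffs)
        # Subtract a constant so the values stay bounded; adding a constant to
        # v shifts every one-step value equally, so later differences are
        # unaffected.
        v = [x - new_v[0] for x in new_v]
        if upper - lower < tol:
            break
    return lower, upper, policy


if __name__ == "__main__":
    lo, hi, pol = odoni_value_iteration(REWARDS, TRANSITIONS)
    print(f"bounds: [{lo:.6f}, {hi:.6f}], current policy: {pol}")
```

For this data, the policy that selects the second action in both states has stationary distribution (7/9, 2/9) and hence average return 7/9 · 4 + 2/9 · (−5) = 2, which the computed interval should enclose.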