Paper information - Optimal decision procedures for finite Markov chains. Part II: Communicating systems

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a given convex family of distributions depending on the present state. The immediate cost is prescribed for each choice and it is required to minimise the average expected cost over an infinite future. The paper considers a special case of this general problem and provides the foundation for a general solution. The main result is that an optimal policy exists if each state of the system can be reached with positive probability from any other state by choosing a suitable policy.
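The condition in the main result, that every state can be reached with positive probability from every other state under a suitable policy, can be checked mechanically for a small model. The following is a minimal sketch, not the paper's own procedure. It assumes a finite set of actions per state rather than the convex family of distributions described above, and the names `is_communicating`, `P_comm`, and `P_not` are hypothetical, introduced only for illustration. State `t` is treated as directly reachable from `s` if at least one action available in `s` gives `t` positive probability, and the check then requires every state to reach every other along such steps.

```python
from collections import deque


def is_communicating(P):
    """Check the reachability condition for a finite controlled Markov chain.

    P[s][a][t] is the probability of moving from state s to state t
    when action a is chosen in state s (finite action sets are an
    assumption of this sketch).
    """
    n = len(P)
    # Edge s -> t exists if some action in s gives t positive probability.
    succ = [
        {t for row in P[s] for t, p in enumerate(row) if p > 0}
        for s in range(n)
    ]

    def reachable_from(start):
        # Breadth-first search over the "some action can lead there" graph.
        seen = {start}
        queue = deque([start])
        while queue:
            s = queue.popleft()
            for t in succ[s]:
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
        return seen

    # Communicating: every state reaches all n states (itself included).
    return all(len(reachable_from(s)) == n for s in range(n))


# Hypothetical two-state examples.
# State 0 can stay or move to state 1; state 1 always returns to state 0.
P_comm = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0]],
]
# State 0 is absorbing under its only action, so state 1 is unreachable from it.
P_not = [
    [[1.0, 0.0]],
    [[0.5, 0.5]],
]

print(is_communicating(P_comm))
print(is_communicating(P_not))
```

In this sketch the first model satisfies the condition and the second does not, because no choice of action in state 0 ever leads to state 1.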

J. Bather

[1] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.

[2] W. Barry. On the Iterative Method of Dynamic Programming on a Finite Space Discrete Time Markov Process, 1965.

[3] A. F. Veinott. On Finding Optimal Policies in Discrete Dynamic Programming with No Discounting, 1966.

[4] C. Derman, et al. A Solution to a Countable System of Equations Arising in Markovian Decision Processes, 1966.

[5] E. Lanery, et al. Étude asymptotique des systèmes markoviens à commande, 1967.

[6] S. Ross. Non-Discounted Denumerable Markovian Decision Models, 1968.

[7] Arie Hordijk, et al. A sufficient condition for the existence of an optimal policy with respect to the average cost criterion in Markovian decision processes: Prepublication, 1971.

[8] Over een doeblinvoorwaarde en haar toepassing in beslissingsprocessen: Voorlopige uitgave, 1972.

[9] J. Bather. Optimal decision procedures for finite Markov chains. Part I: Examples, 1973, Advances in Applied Probability.