Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy

In this paper we construct an algorithm of successive approximation type that computes a locally optimal periodic policy for a Markov decision chain with partial state information. The algorithm is applied to a queueing network with server control.
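To make the notion of a locally optimal periodic policy concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes the extreme case of no state information, so a periodic policy is just a fixed action sequence repeated cyclically. It replaces the successive approximation scheme with a plain one-position-at-a-time local search. The names `step`, `average_cost` and `local_search`, and the data layout `P[a][i][j]` and `c[a][i]`, are hypothetical. The burn-in assumes that the chain under the cyclic policy forgets its initial distribution.

```python
def step(dist, P):
    """Propagate a state distribution one step through transition matrix P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def average_cost(policy, P, c, cycles=500):
    """Estimate the long-run average cost per step of a cyclic action sequence.

    policy: list of action indices applied cyclically, ignoring the state.
    P[a][i][j]: transition probability i -> j under action a (hypothetical layout).
    c[a][i]: one-step cost in state i under action a (hypothetical layout).
    """
    n = len(P[0])
    dist = [1.0 / n] * n
    # Burn-in: assumes the distribution settles onto its periodic orbit.
    for _ in range(cycles):
        for a in policy:
            dist = step(dist, P[a])
    # Average the expected cost over one further full cycle.
    total = 0.0
    for a in policy:
        total += sum(dist[i] * c[a][i] for i in range(n))
        dist = step(dist, P[a])
    return total / len(policy)


def local_search(policy, P, c):
    """Change one position at a time until no single change lowers the cost."""
    policy = list(policy)
    best = average_cost(policy, P, c)
    improved = True
    while improved:
        improved = False
        for k in range(len(policy)):
            for a in range(len(P)):
                if a == policy[k]:
                    continue
                trial = policy[:k] + [a] + policy[k + 1:]
                cost = average_cost(trial, P, c)
                if cost < best - 1e-12:
                    policy, best, improved = trial, cost, True
    return policy, best
```

Calling `local_search([0, 0, 0], P, c)` on a small model returns a period-3 action sequence that no single-position change improves. That is local optimality only with respect to this neighbourhood, and the result can depend on the starting sequence.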
