On the computation of the optimal cost function for discrete time Markov models with partial observations

We consider several applications of two-state, finite-action, infinite-horizon, discrete-time Markov decision processes with partial observations, for two special cases of observation quality, and show that in each case the optimal cost function is piecewise linear. This in turn yields either explicit formulas or simplified algorithms for computing the optimal cost function and the associated optimal control policy. Several examples are presented.
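The piecewise linearity of the optimal cost function over the belief space is what makes exact dynamic-programming backups tractable in the two-state case. The sketch below illustrates this with standard alpha-vector value iteration for a hypothetical two-state machine-replacement model; the transition, observation, and cost numbers, the discount factor, and the grid-based pruning are all illustrative assumptions, not taken from the paper.

```python
import itertools

# Hypothetical two-state replacement POMDP (all numbers are illustrative):
# states 0 = good, 1 = bad; actions 0 = continue, 1 = replace.
BETA = 0.9                            # discount factor (assumed)
S, A, O = 2, 2, 2                     # states, actions, observations
# P[a][s][s2]: transition probability; Q[a][s2][o]: observation probability;
# C[a][s]: one-step cost.
P = [[[0.9, 0.1], [0.0, 1.0]],        # continue: good may deteriorate
     [[0.9, 0.1], [0.9, 0.1]]]        # replace: machine restarts as good
Q = [[[0.8, 0.2], [0.3, 0.7]],        # noisy signal of the true state
     [[0.8, 0.2], [0.3, 0.7]]]
C = [[0.0, 2.0], [1.0, 3.0]]          # replacement adds a fixed cost

def backup(alphas):
    """One exact DP step: the new value function is the lower envelope
    (pointwise minimum over the belief simplex) of the vectors returned,
    hence piecewise linear."""
    new = []
    for a in range(A):
        # choose one previous alpha vector to follow after each observation
        for choice in itertools.product(alphas, repeat=O):
            vec = tuple(
                C[a][s] + BETA * sum(
                    P[a][s][s2] * Q[a][s2][o] * choice[o][s2]
                    for s2 in range(S) for o in range(O))
                for s in range(S))
            new.append(vec)
    # cheap two-state pruning: keep vectors minimal somewhere on a belief grid
    grid = [i / 100 for i in range(101)]
    keep = {min(new, key=lambda v: v[0] * (1 - b) + v[1] * b) for b in grid}
    return list(keep)

alphas = [(0.0, 0.0)]
for _ in range(50):
    alphas = backup(alphas)

def V(b):
    """Approximate optimal cost at belief b = Prob(state is bad)."""
    return min(v[0] * (1 - b) + v[1] * b for v in alphas)
```

Because each iterate is a minimum of finitely many linear functions of the belief, the value function stays piecewise linear and concave at every step; the special observation structures studied in the paper keep the number of linear pieces small enough for explicit formulas.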
