On the computation of the optimal cost function for discrete time Markov models with partial observations

We consider several applications of two-state, finite-action, infinite-horizon, discrete-time Markov decision processes with partial observations, for two special cases of observation quality, and show that in each case the optimal cost function is piecewise linear. This in turn yields either explicit formulas or simplified algorithms for computing the optimal cost function and the associated optimal control policy. Several examples are presented.
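The piecewise linearity of the optimal cost function over the belief space is what makes exact dynamic-programming backups tractable in the two-state case. The sketch below illustrates this with standard alpha-vector value iteration for a hypothetical two-state machine-replacement model; the transition, observation, and cost numbers, the discount factor, and the grid-based pruning are all illustrative assumptions, not taken from the paper.

```python
import itertools

# Hypothetical two-state replacement POMDP (all numbers are illustrative):
# states 0 = good, 1 = bad; actions 0 = continue, 1 = replace.
BETA = 0.9                            # discount factor (assumed)
S, A, O = 2, 2, 2                     # states, actions, observations
# P[a][s][s2]: transition probability; Q[a][s2][o]: observation probability;
# C[a][s]: one-step cost.
P = [[[0.9, 0.1], [0.0, 1.0]],        # continue: good may deteriorate
     [[0.9, 0.1], [0.9, 0.1]]]        # replace: machine restarts as good
Q = [[[0.8, 0.2], [0.3, 0.7]],        # noisy signal of the true state
     [[0.8, 0.2], [0.3, 0.7]]]
C = [[0.0, 2.0], [1.0, 3.0]]          # replacement adds a fixed cost

def backup(alphas):
    """One exact DP step: the new value function is the lower envelope
    (pointwise minimum over the belief simplex) of the vectors returned,
    hence piecewise linear."""
    new = []
    for a in range(A):
        # choose one previous alpha vector to follow after each observation
        for choice in itertools.product(alphas, repeat=O):
            vec = tuple(
                C[a][s] + BETA * sum(
                    P[a][s][s2] * Q[a][s2][o] * choice[o][s2]
                    for s2 in range(S) for o in range(O))
                for s in range(S))
            new.append(vec)
    # cheap two-state pruning: keep vectors minimal somewhere on a belief grid
    grid = [i / 100 for i in range(101)]
    keep = {min(new, key=lambda v: v[0] * (1 - b) + v[1] * b) for b in grid}
    return list(keep)

alphas = [(0.0, 0.0)]
for _ in range(50):
    alphas = backup(alphas)

def V(b):
    """Approximate optimal cost at belief b = Prob(state is bad)."""
    return min(v[0] * (1 - b) + v[1] * b for v in alphas)
```

Because each iterate is a minimum of finitely many linear functions of the belief, the value function stays piecewise linear and concave at every step; the special observation structures studied in the paper keep the number of linear pieces small enough for explicit formulas.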
