Optimal Stopping in a Partially Observable Markov Process with Costly Information

A problem of optimal stopping in a Markov chain whose states are not directly observable is presented. Using the theory of partially observable Markov decision processes, a model is developed that combines the classical stopping problem with sequential sampling at each stage of the decision process. Several results are given that characterize the optimal expected value function in terms of the model's parameters. An example indicates that the best action, as a function of the information currently available, may not be of the intuitively appealing control-limit type: the set of belief states at which it is optimal to purchase information need not be convex. The expected value of information, viewed as a function of the decision maker's knowledge, is related to such nonmonotone optimal policies.
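The model described above can be illustrated with a minimal sketch. The following is not the paper's formulation but an assumed two-state instance: a machine deteriorates from "good" to "bad", the belief b = P(bad) is the decision maker's knowledge, and at each stage one may stop (salvage value), continue blindly, or continue after purchasing a noisy observation at a cost. Value iteration over a discretized belief interval approximates the optimal expected value function; all parameter values are illustrative only.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the paper).
q = 0.2          # P(good -> bad) in one transition
r = (1.0, -1.0)  # per-stage reward when continuing in (good, bad)
s = 0.0          # salvage value received on stopping
c = 0.3          # cost of purchasing one observation
acc = 0.9        # P(observation reports the true state)
beta = 0.95      # discount factor
N = 201
grid = np.linspace(0.0, 1.0, N)  # discretized belief b = P(bad)

def predict(b):
    """Belief after one unobserved transition of the chain."""
    return b + (1.0 - b) * q

def V_at(V, b):
    """Linear interpolation of the value function on the belief grid."""
    return np.interp(b, grid, V)

V = np.zeros(N)
for _ in range(500):
    Vn = np.empty(N)
    for i, b in enumerate(grid):
        reward = (1.0 - b) * r[0] + b * r[1]   # expected one-stage reward
        bp = predict(b)
        # Continue without buying information.
        cont = reward + beta * V_at(V, bp)
        # Buy a noisy signal, then update the belief by Bayes' rule.
        p_sig_bad = acc * bp + (1.0 - acc) * (1.0 - bp)
        b_if_bad = acc * bp / p_sig_bad
        b_if_good = (1.0 - acc) * bp / (1.0 - p_sig_bad)
        info = reward - c + beta * (
            p_sig_bad * V_at(V, b_if_bad)
            + (1.0 - p_sig_bad) * V_at(V, b_if_good)
        )
        Vn[i] = max(s, cont, info)             # stop / continue / buy info
    if np.max(np.abs(Vn - V)) < 1e-9:
        V = Vn
        break
    V = Vn
```

In such a sketch one can inspect, for each belief b, which of the three terms attains the maximum; as the abstract notes, the region where purchasing information is optimal need not be a single interval of beliefs.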
