A two-state partially observable Markov decision process with three actions

A process alternates between a stable and an unstable state. The true state is unobservable and can only be inferred from observations. Three actions are available: continue operating the process (CON), repair the process for a fixed fee, which returns it to the stable state (REP), and inspect the process for a cost to learn its true state (INS). The objective is to maximize the expected discounted value of total future profits. We formulate the problem as a discrete-time partially observable Markov decision process (POMDP). We show that the expected profit function is convex and strictly increasing, and that the optimal policy has either one or two control limits. We also show that "dominance in expectation" (the expected revenue being larger in the stable state than in the unstable state) suffices for a control-limit structure.
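The model can be sketched with value iteration over a discretized belief space. The sketch below is illustrative only: all parameters (revenues `r_s > r_u`, repair fee `c_rep`, inspection cost `c_ins`, stability persistence `q`, discount `beta`) are hypothetical, and the Bellman operator is one plausible reading of the three actions (CON earns the expected revenue and the belief decays; REP pays a fee and restarts from the stable state; INS pays a cost and collapses the belief to 0 or 1), not the paper's exact formulation.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
beta = 0.95            # discount factor
r_s, r_u = 10.0, 2.0   # per-period revenue: stable vs. unstable ("dominance in expectation")
c_rep, c_ins = 15.0, 1.0
q = 0.9                # P(stable stays stable); unstable is absorbing until repaired

grid = np.linspace(0.0, 1.0, 201)   # belief p = P(process is stable)

def revenue(p):
    return p * r_s + (1 - p) * r_u

def tau(p):                          # belief after one CON step
    return p * q

V = np.zeros_like(grid)
for _ in range(2000):
    con = revenue(grid) + beta * np.interp(tau(grid), grid, V)
    rep = -c_rep + revenue(1.0) + beta * np.interp(tau(1.0), grid, V)
    ins = -c_ins + revenue(grid) + beta * (
        grid * np.interp(tau(1.0), grid, V)
        + (1 - grid) * np.interp(tau(0.0), grid, V))
    V_new = np.maximum(con, np.maximum(rep, ins))
    if np.max(np.abs(V_new - V)) < 1e-9:   # sup-norm convergence
        V = V_new
        break
    V = V_new

# Optimal action per belief: 0 = CON, 1 = REP, 2 = INS.
policy = np.argmax(np.vstack([con, np.full_like(grid, rep), ins]), axis=0)
```

On such toy instances the computed value function comes out convex and nondecreasing in the belief, and the policy partitions the belief interval into at most a few contiguous action regions, consistent with the one- or two-control-limit structure established in the paper.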
