Cost rate heuristics for semi-Markov decision processes

In response to the computational complexity of the dynamic programming/backwards induction approach to the development of optimal policies for semi-Markov decision processes, we propose a class of heuristics resulting from an inductive process which proceeds forwards in time. These heuristics always choose actions in such a way as to minimize some measure of the current cost rate. We describe a procedure for calculating such cost rate heuristics. The quality of the performance of such policies is related to the speed of evolution (in a cost sense) of the process. A simple model of preventive maintenance is described in detail. Cost rate heuristics for this problem are calculated and assessed computationally.

DYNAMIC PROGRAMMING; REPLACEMENT POLICY

AMS 1991 SUBJECT CLASSIFICATION: PRIMARY 90C39

Michael P. Bailey | Kevin D. Glazebrook | Lyn R. Whitaker

[1] C. S. Chen, et al. A discounted cost relationship, 1988.

[2] Michael N. Katehakis, et al. The Multi-Armed Bandit Problem: Decomposition and Computation, 1987, Math. Oper. Res.

[3] Terje Aven, et al. Optimal replacement under a minimal repair strategy—a general failure model, 1983, Advances in Applied Probability.

[4] T. Aven, et al. Optimal replacement times — a general set-up, 1986, Journal of Applied Probability.

[5] D. Blackwell. Discounted Dynamic Programming, 1965.

[6] C. White. Bounds on optimal cost for a replacement problem with partial observations, 1979.

[7] S. Christian Albright, et al. Structural Results for Partially Observable Markov Decision Processes, 1979, Oper. Res.

[8] K. Glazebrook. Strategy evaluation for stochastic scheduling problems with order constraints , 1991, Advances in Applied Probability.

[9] David Ruppert, et al. Sequential Nonparametric Age Replacement Policies, 1985.