Cost rate heuristics for semi-Markov decision processes

In response to the computational complexity of the dynamic programming/backward induction approach to developing optimal policies for semi-Markov decision processes, we propose a class of heuristics resulting from an inductive process which proceeds forwards in time. These heuristics always choose actions so as to minimize some measure of the current cost rate. We describe a procedure for calculating such cost rate heuristics. How well such policies perform is related to the speed at which the process evolves (in a cost sense). A simple model of preventive maintenance is described in detail. Cost rate heuristics for this problem are calculated and assessed computationally.

DYNAMIC PROGRAMMING; REPLACEMENT POLICY

AMS 1991 SUBJECT CLASSIFICATION: PRIMARY 90C39
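
The idea of choosing actions to minimize a measure of the current cost rate can be illustrated with a minimal Python sketch. It is not the procedure developed in the paper: it assumes the simplest possible measure, namely the expected cost incurred until the next decision epoch divided by the expected time until that epoch. The three-state maintenance example, the `Action` type, the function names and all numerical values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One available action in a state (hypothetical illustration)."""
    name: str
    expected_cost: float      # expected cost incurred until the next decision epoch
    expected_duration: float  # expected time until the next decision epoch (must be > 0)


def cost_rate(action: Action) -> float:
    """One-step cost rate: expected cost per unit of expected time."""
    if action.expected_duration <= 0:
        raise ValueError("expected_duration must be positive")
    return action.expected_cost / action.expected_duration


def greedy_cost_rate_policy(actions_by_state: dict) -> dict:
    """For each state, pick the action with the smallest one-step cost rate."""
    return {
        state: min(actions, key=cost_rate).name
        for state, actions in actions_by_state.items()
    }


# Assumed toy maintenance model: a machine that is new, worn, or failed.
actions_by_state = {
    "new": [
        Action("operate", expected_cost=1.0, expected_duration=1.0),   # rate 1.0
        Action("replace", expected_cost=5.0, expected_duration=2.0),   # rate 2.5
    ],
    "worn": [
        Action("operate", expected_cost=4.0, expected_duration=1.0),   # rate 4.0
        Action("replace", expected_cost=5.0, expected_duration=2.0),   # rate 2.5
    ],
    "failed": [
        Action("repair", expected_cost=12.0, expected_duration=3.0),   # rate 4.0
        Action("replace", expected_cost=9.0, expected_duration=2.0),   # rate 4.5
    ],
}

policy = greedy_cost_rate_policy(actions_by_state)
print(policy)
```

Under these assumed numbers the sketch operates a new machine, replaces a worn one, and repairs a failed one. A one-step ratio of this kind ignores where each action leaves the process afterwards, so it should be read only as the most myopic member of the family of cost rate measures the abstract refers to.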