Solving the inspection and maintenance problem of a deteriorating system based on Q-learning

This paper models the inspection and maintenance problem for a deteriorating system with discrete states in continuous time as a semi-Markov decision process. Because the state transition probabilities are difficult to derive, and in order to avoid becoming trapped in local optima, an algorithm that combines Q-learning with simulated annealing is proposed to obtain the optimal maintenance policy. Optimized results are obtained under both the average-cost and discounted-cost criteria, and the simulation results indicate the feasibility of the method. Furthermore, the paper uses the simulation data to discuss the influence of the inspection interval on the optimized average cost, and the observed relationship is consistent with what is expected in practice.
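
To make the idea concrete, the following is a minimal sketch of how Q-learning might be combined with a simulated-annealing acceptance rule on a semi-Markov decision process under the discounted-cost criterion. It is not the paper's algorithm or model: the four-state system, the two actions, all cost figures, the deterioration rate, the discount rate BETA, the inspection interval TAU, and the cooling schedule are hypothetical values chosen only for illustration, and the average-cost criterion is not covered.

```python
import math
import random

# All numbers below are hypothetical, chosen only to illustrate the idea.
N_STATES = 4      # 0 = as good as new, ..., 3 = failed
ACTIONS = (0, 1)  # 0 = keep operating until next inspection, 1 = replace
TAU = 1.0         # inspection interval
BETA = 0.1        # continuous-time discount rate


def step(state, action, rng):
    """Toy simulator standing in for the unknown transition law.

    Returns (cost, sojourn_time, next_state). The cost is treated as a
    lump sum paid at the start of the sojourn.
    """
    if action == 1:
        # Replacement takes a random time and restores the system to state 0.
        return 5.0, rng.uniform(0.2, 0.5), 0
    # Keep operating: the system degrades one level after each exponential
    # holding time, until the next inspection at time TAU.
    elapsed, nxt = 0.0, state
    while nxt < N_STATES - 1:
        elapsed += rng.expovariate(0.8)
        if elapsed > TAU:
            break
        nxt += 1
    cost = 0.5 + 2.0 * state * TAU  # inspection cost + operating cost
    return cost, TAU, nxt


def choose_action(q_row, temperature, rng):
    """Metropolis-style rule: accept a random action over the greedy one
    with probability exp(-(extra cost) / temperature)."""
    greedy = min(ACTIONS, key=lambda a: q_row[a])
    candidate = rng.choice(ACTIONS)
    delta = q_row[candidate] - q_row[greedy]  # >= 0 because greedy is cheapest
    if rng.random() < math.exp(-delta / temperature):
        return candidate
    return greedy


def train(episodes=2000, steps=50, lr=0.1, t0=5.0, decay=0.995, seed=0):
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    temperature = t0
    for _ in range(episodes):
        state = 0
        for _ in range(steps):
            action = choose_action(q[state], temperature, rng)
            cost, sojourn, nxt = step(state, action, rng)
            # SMDP update: discount the future by the random sojourn time.
            target = cost + math.exp(-BETA * sojourn) * min(q[nxt])
            q[state][action] += lr * (target - q[state][action])
            state = nxt
        # Geometric cooling, floored so the temperature stays positive.
        temperature = max(temperature * decay, 1e-3)
    return q


q_table = train()
policy = [min(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES)]
print(policy)
```

At high temperature the rule accepts costly exploratory actions almost freely, and as the temperature falls it approaches the greedy policy, which is the mechanism by which annealing is meant to reduce the risk of settling on a locally optimal policy. Changing TAU in this toy model and retraining gives a rough way to see how the inspection interval affects the learned costs.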