Some Notes on Dynamic Programming and Replacement
The first section describes a modification to Howard's policy improvement routine for Markov decision problems. The modified routine normally converges more rapidly to the optimal policy. The second section develops, for a certain type of dynamic programming problem, a particular form of recurrence relation that leads to the rapid determination of improved policies. The relation is used to show that the repair limit method is the optimal strategy for a basic equipment replacement problem.
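For orientation, the following is a minimal sketch of Howard's standard (unmodified) policy iteration for a discounted Markov decision problem. It is not the modified routine of the paper, whose details are not given in this abstract. The three-state keep/replace example, the discount factor, and all names (`solve`, `policy_iteration`, `P`, `R`) are hypothetical and chosen only for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def policy_iteration(P, R, gamma):
    """Standard policy iteration.

    P[a][i][j]: probability of moving from state i to j under action a.
    R[a][i]:    expected one-step reward in state i under action a.
    gamma:      discount factor, 0 <= gamma < 1.
    """
    n_actions, n_states = len(R), len(R[0])
    policy = [0] * n_states  # start with action 0 in every state
    while True:
        # Value determination: solve (I - gamma * P_policy) v = r_policy.
        A = [[(1.0 if i == j else 0.0) - gamma * P[policy[i]][i][j]
              for j in range(n_states)] for i in range(n_states)]
        b = [R[policy[i]][i] for i in range(n_states)]
        v = solve(A, b)

        def q(a, i):
            # One-step lookahead value of action a in state i.
            return R[a][i] + gamma * sum(P[a][i][j] * v[j]
                                         for j in range(n_states))

        # Policy improvement: pick the best action in each state,
        # keeping the current action on ties so the loop terminates.
        new_policy = []
        for i in range(n_states):
            best = max(range(n_actions), key=lambda a: q(a, i))
            if q(policy[i], i) >= q(best, i) - 1e-12:
                best = policy[i]
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Hypothetical equipment example: states 0 = good, 1 = worn, 2 = failed;
# actions 0 = keep, 1 = replace (returns the equipment to "good").
P = [
    [[0.7, 0.3, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]],  # keep
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # replace
]
R = [
    [10.0, 6.0, 0.0],    # keep: income depends on condition
    [-5.0, -5.0, -5.0],  # replace: fixed cost, no income that period
]

policy, v = policy_iteration(P, R, 0.9)
```

Each pass alternates an exact value-determination step (a linear solve) with a greedy improvement step, and stops when the policy no longer changes.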
[1] R. Bellman. Dynamic Programming, 1957, Science.
[2] N. A. J. Hastings, et al. An Economic Replacement Model, 1967.
[3] 飯原 慶雄, et al. Sequential Decision Processes, 1960.
[4] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.