Discounted semi-Markov decision processes: linear programming and policy iteration
For semi-Markov decision processes with discounted rewards, we derive the well-known results on the structure of optimal strategies (nonrandomized, stationary Markov strategies) and on the standard algorithms (linear programming, policy iteration). Our analysis is based entirely on a primal linear programming formulation of the problem.
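As an illustration of the policy iteration algorithm named in the abstract, the following is a minimal sketch for a finite discounted semi-Markov decision process. It is not the paper's derivation. It assumes the model is already given in a reduced form: reward[i][a] is the expected discounted reward earned until the next decision epoch, and kernel[i][a][j] is the expected discount factor multiplied by the probability of moving to state j, so each row sums to less than 1. All names and numbers are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def action_value(reward, kernel, v, i, a):
    """One-step lookahead value of taking action a in state i."""
    return reward[i][a] + sum(kernel[i][a][j] * v[j] for j in range(len(v)))


def policy_iteration(reward, kernel, tol=1e-12):
    """Return a nonrandomized stationary policy and its value vector."""
    n = len(reward)
    policy = [0] * n  # arbitrary starting policy
    while True:
        # Policy evaluation: solve (I - M_pi) v = r_pi.
        A = [[(1.0 if i == j else 0.0) - kernel[i][policy[i]][j]
              for j in range(n)] for i in range(n)]
        b = [reward[i][policy[i]] for i in range(n)]
        v = solve_linear(A, b)
        # Policy improvement: switch only on a strict improvement.
        new_policy = []
        for i in range(n):
            best = policy[i]
            for a in range(len(reward[i])):
                if (action_value(reward, kernel, v, i, a)
                        > action_value(reward, kernel, v, i, best) + tol):
                    best = a
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Hypothetical two-state, two-action example (made-up numbers).
reward = [[1.0, 2.0], [0.0, 0.5]]
kernel = [
    [[0.5, 0.4], [0.0, 0.9]],  # state 0: action 0, action 1
    [[0.9, 0.0], [0.0, 0.5]],  # state 1: action 0, action 1
]
policy, values = policy_iteration(reward, kernel)
print(policy, values)
```

Because every row of the kernel sums to less than 1, the evaluation system always has a unique solution, and the loop stops after finitely many improvements since there are finitely many nonrandomized stationary policies.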