A New Value Iteration Method for the Average Cost Dynamic Programming Problem
[1] D. White. Dynamic programming, Markov chains, and the method of successive approximations, 1963.
[2] Amedeo R. Odoni, et al. On Finding the Maximal Gain for Markov Decision Processes, 1969, Oper. Res.
[3] P. Schweitzer. Iterative solution of the functional equations of undiscounted Markov renewal programming, 1971.
[4] P. Varaiya. Optimal and suboptimal stationary controls for Markov chains, 1978.
[5] J. Popyack, et al. Discrete versions of an algorithm due to Varaiya, 1979.
[6] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.
[7] P. Tseng. Solving H-horizon, stationary Markov decision problems in time proportional to log(H), 1990.
[8] John N. Tsitsiklis, et al. An Analysis of Stochastic Shortest Path Problems, 1991, Math. Oper. Res.
[9] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.