A value iteration method for undiscounted multichain Markov decision processes
[1] Thomas E. Morton. Technical Note - Undiscounted Markov Renewal Programming Via Modified Successive Approximations, 1971, Oper. Res.
[2] Paul J. Schweitzer, et al. The Asymptotic Behavior of Undiscounted Value Iteration in Markov Decision Problems, 1977, Math. Oper. Res.
[3] Paul J. Schweitzer, et al. Iterative bounds on the relative value vector in undiscounted Markov renewal programming, 1985, Z. Oper. Research.
[4] A. Hordijk, et al. Linear Programming and Markov Decision Chains, 1979.
[5] L. C. M. Kallenberg, et al. Linear programming and finite Markovian control problems, 1984.
[6] J. Wal. The method of value oriented successive approximations for the average reward Markov decision process, 1980.
[7] E. Denardo, et al. Multichain Markov Renewal Programs, 1968.
[8] A. F. Veinott. On Finding Optimal Policies in Discrete Dynamic Programming with No Discounting, 1966.
[9] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[10] E. Denardo. Markov Renewal Programs with Small Interest Rates, 1971.
[11] P. Schweitzer. Iterative solution of the functional equations of undiscounted Markov renewal programming, 1971.
[12] Loren Platzman, et al. Technical Note - Improved Conditions for Convergence in Undiscounted Markov Renewal Programming, 1977, Oper. Res.
[13] E. Denardo. A Markov Decision Problem, 1973.
[14] J. Bather. Optimal decision procedures for finite Markov chains. Part III: General convex systems, 1973.
[15] Katsuhisa Ohno, et al. Computing Optimal Policies for Controlled Tandem Queueing Systems, 1987, Oper. Res.
[16] W. Barry. On the Iterative Method of Dynamic Programming on a Finite Space Discrete Time Markov Process, 1965.
[17] Awi Federgruen, et al. A New Specification of the Multichain Policy Iteration Algorithm in Undiscounted Markov Renewal Programs, 1980.
[18] J. Bather. Optimal decision procedures for finite Markov chains. Part I: Examples, 1973, Advances in Applied Probability.
[19] P. Schweitzer, et al. Geometric convergence of value-iteration in multichain Markov decision problems, 1979, Advances in Applied Probability.
[20] Amedeo R. Odoni, et al. On Finding the Maximal Gain for Markov Decision Processes, 1969, Oper. Res.
[21] Peter Whittle, et al. Optimization Over Time, 1982.
[22] Cyrus Derman, et al. Finite State Markovian Decision Processes, 1970.
[23] D. White, et al. Dynamic programming, Markov chains, and the method of successive approximations, 1963.
[24] Paul J. Schweitzer, et al. A value-iteration scheme for undiscounted multichain Markov renewal programs, 1984, Z. Oper. Research.
[25] Dieter Spreen, et al. A further anticycling rule in multichain policy iteration for undiscounted Markov renewal programs, 1981, Z. Oper. Research.