A K-step look-ahead analysis of value iteration algorithms for Markov decision processes