Paper information - A methodology for computation reduction for specially structured large scale Markov decision problems


Abstract: Markov Decision Processes (MDPs) deal with sequential decision making in stochastic systems. Existing solution techniques provide powerful tools for determining the optimal policy set in such systems. However, many problems have extremely large state and action spaces, making them computationally intractable. Typically, the state variable definition is n-dimensional and the number of states expands at a rate proportional to the power of n. For such large problems, the need for large amounts of random access memory and computation time restricts the ability to obtain solutions. The purpose of this paper is both to present a methodology which takes advantage of the structure of many large scale problems (i.e., problems with a high percentage of transient states under optimal control), and to provide computational results indicating the value of the approach.

Thom J. Hodgson | Russell E. King | Fong-Yuen Ding
