A methodology for computation reduction for specially structured large scale Markov decision problems

Abstract

Markov decision processes (MDPs) deal with sequential decision making in stochastic systems. Existing solution techniques provide powerful tools for determining the optimal policy set in such systems. However, many problems have extremely large state and action spaces, making them computationally intractable. Typically, the state variable is n-dimensional and the number of states grows exponentially with n. For such large problems, the need for large amounts of random access memory and computation time restricts the ability to obtain solutions. The purpose of this paper is both to present a methodology that takes advantage of the structure of many large-scale problems (i.e., problems with a high percentage of transient states under optimal control) and to provide computational results indicating the value of the approach.