Graph partitioning techniques for Markov Decision Processes decomposition

Recently, several authors have proposed serial decomposition techniques for solving large Markov Decision Processes, either exactly or approximately. Many of these techniques rely on a clever partitioning of the state space into roughly independent parts, which communicate with each other only through relatively few communicating states. The efficiency of these decomposition methods clearly depends on the size of the set of communicating states: the smaller, the better. However, the task of finding such a decomposition with few communicating states is often (if not always) left to the user. In this paper, we present automated decomposition techniques for MDPs, based on methods originally developed for graph partitioning problems.
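As a rough illustration of the general idea (not the specific algorithm of the paper), the sketch below builds an undirected graph over the states of a small, hypothetical MDP, splits it into two blocks with spectral bisection (the Fiedler vector of the graph Laplacian, one of the classical graph-partitioning tools cited below), and then reports the communicating states, i.e. those incident to a transition that crosses the partition boundary. The transition matrix and the median-based split are illustrative assumptions only.

```python
import numpy as np

def spectral_bisection(adj):
    """Split an undirected graph into two blocks using the Fiedler vector of its Laplacian."""
    degrees = adj.sum(axis=1)
    laplacian = np.diag(degrees) - adj
    eigvals, eigvecs = np.linalg.eigh(laplacian)      # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                           # eigenvector of the 2nd-smallest eigenvalue
    return fiedler >= np.median(fiedler)              # boolean block label per state

def communicating_states(transitions, block):
    """States incident to a transition that crosses the partition boundary."""
    n = transitions.shape[0]
    comm = set()
    for s in range(n):
        for t in range(n):
            if transitions[s, t] > 0 and block[s] != block[t]:
                comm.update((s, t))
    return comm

# Hypothetical 6-state MDP, aggregated over actions: transitions[s, t] > 0 means
# some action can move the process from state s to state t.
transitions = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

# Symmetrize to obtain the undirected adjacency matrix used for partitioning.
adj = np.maximum(transitions, transitions.T)
block = spectral_bisection(adj)
print("block assignment:", block.astype(int))
print("communicating states:", sorted(communicating_states(transitions, block)))
```

On this toy example the two triangles of states are separated, and only the two states joined by the bridging transition are reported as communicating, which is the quantity an automated decomposition method would try to keep small.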
