A multi-cluster time aggregation approach for Markov chains

Abstract This work focuses on the computation of the steady state distribution of a Markov chain by means of an embedding algorithm. In this regard, a well-known approach dubbed time aggregation was proposed in Cao et al. (2002). Roughly, the idea hinges on partitioning the state space into two subsets. The linchpin of this partitioning is a small subset of states, selected to serve as the state space of the aggregated process, i.e., the state space of the embedded semi-Markov process. Although this approach has produced an interesting body of theoretical results and made progress on the so-called curse of dimensionality, one is still left with a high-dimensional problem to solve. In this paper we investigate the possibility of remedying this problem by proposing a time aggregation approach with multiple subsets. This is achieved by devising a decomposition algorithm that uses a partition scheme to evaluate the steady state probabilities of the chain. Besides a convergence proof for the algorithm, we also prove a result relating the cardinality of the partition to the computational effort of the algorithm, for the case in which the state space is partitioned into a collection of subsets of the same cardinality.
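For orientation, the following minimal sketch (in Python, using NumPy) illustrates the classical iterative aggregation/disaggregation idea that this line of work builds on (cf. [2], [8], [9], [21]): split the state space into groups, solve an exactly aggregated small chain, then disaggregate and smooth. This is not the paper's multi-cluster time aggregation algorithm; the partition `groups`, the toy transition matrix `P`, and all function names are illustrative assumptions.

    import numpy as np

    def stationary_small(P):
        """Stationary vector of a small stochastic matrix: solve pi P = pi, sum(pi) = 1."""
        n = P.shape[0]
        A = np.vstack([P.T - np.eye(n), np.ones(n)])
        b = np.zeros(n + 1); b[-1] = 1.0
        pi, *_ = np.linalg.lstsq(A, b, rcond=None)
        return pi

    def iad_stationary(P, groups, tol=1e-12, max_iter=1000):
        """Iterative aggregation/disaggregation sketch for pi = pi P (illustrative only)."""
        n = P.shape[0]
        pi = np.full(n, 1.0 / n)                   # uniform initial guess
        for _ in range(max_iter):
            # Aggregation: collapse the chain onto the K groups, weighting each
            # state by its current conditional probability within its group.
            K = len(groups)
            A = np.zeros((K, K))
            for I, gI in enumerate(groups):
                w = pi[gI] / pi[gI].sum()          # conditional weights inside group I
                for J, gJ in enumerate(groups):
                    A[I, J] = w @ P[np.ix_(gI, gJ)].sum(axis=1)
            xi = stationary_small(A)               # stationary law of the K-state chain
            # Disaggregation: rescale each group to its aggregated mass, then
            # apply one power-iteration sweep as a smoothing step.
            new_pi = pi.copy()
            for I, gI in enumerate(groups):
                new_pi[gI] = xi[I] * pi[gI] / pi[gI].sum()
            new_pi = new_pi @ P
            new_pi /= new_pi.sum()
            if np.abs(new_pi - pi).sum() < tol:
                return new_pi
            pi = new_pi
        return pi

    # Toy example: a random 6-state chain split into three 2-state groups.
    rng = np.random.default_rng(0)
    P = rng.random((6, 6)); P /= P.sum(axis=1, keepdims=True)
    groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
    pi = iad_stationary(P, groups)
    print(pi, np.allclose(pi, pi @ P))

The point of the sketch is the cost structure the abstract alludes to: each iteration solves one small K-state problem plus cheap group-local updates, rather than the full n-state system at once.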

[1] Richard S. Sutton et al., Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[2] I. Marek et al., Iterative aggregation/disaggregation method for computing stationary probability vectors of Markov type operators, 1996.

[3] John G. Kemeny et al., Finite Markov Chains, 1960.

[4] Thomas A. Manteuffel et al., Multilevel Adaptive Aggregation for Markov Chains, with Application to Web Ranking, 2008, SIAM J. Sci. Comput.

[5] G. P. Barker et al., Convergent iterations for computing stationary distributions of Markov chains, 1986.

[6] Olle Häggström, Finite Markov Chains and Algorithmic Applications, 2002.

[7] Warren B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2007, Wiley Series in Probability and Statistics.

[8] Katja Biswas, An iterative aggregation and disaggregation method for the steady state solution of large scale continuous systems, 2015, Comput. Phys. Commun.

[9] Daniel P. Heyman et al., Comparisons Between Aggregation/Disaggregation and a Direct Algorithm for Computing the Stationary Probabilities of a Markov Chain, 1995, INFORMS J. Comput.

[10] P. Varaiya et al., Multilayer control of large Markov chains, 1978.

[11] Marcelo D. Fragoso et al., Time aggregated Markov decision processes via standard dynamic programming, 2011, Oper. Res. Lett.

[12] Xi-Ren Cao et al., A time aggregation approach to Markov decision processes, 2002, Automatica.

[13] William J. Stewart, Introduction to the Numerical Solution of Markov Chains, 1994.

[14] H. Khalil et al., Aggregation of the policy iteration method for nearly completely decomposable Markov chains, 1991.

[15] Adam Shwartz et al., Exact finite approximations of average-cost countable Markov decision processes, 2007, Automatica.

[16] Udo R. Krieger et al., On a two-level multigrid solution method for finite Markov chains, 1995.

[17] Martin L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[18] Heinz Koeppl et al., Optimal Kullback–Leibler Aggregation via Information Bottleneck, 2013, IEEE Transactions on Automatic Control.

[19] Changfeng Ma et al., Parallel multisplitting iteration methods based on M-splitting for the PageRank problem, 2015, Appl. Math. Comput.

[20] M. Freidlin et al., Random Perturbations of Dynamical Systems, 1984.

[21] M. Haviv, Aggregation/disaggregation methods for computing the stationary distribution of a Markov chain, 1987.