Folding Algorithm for Policy Evaluation for Markov Decision Processes With Quasi-Birth Death Structure

This technical note presents a new numerical procedure for policy evaluation of Stochastic Shortest Path Markov Decision Processes (MDPs) having a level independent Quasi Birth-Death structure. The algorithm is derived using a method analogous to the folding method of Ye and Li (1994). The computational complexity is O(M3log2N) + O(M2N), where the process has N levels and M phases. A simple example involving the control of two queues is presented to illustrate the application of this efficient policy evaluation algorithm to compare and rank control policies.