Solving Markovian decision processes by successive elimination of variables

Abstract. The functional equations of stationary, infinite-horizon Markovian decision processes can be solved by eliminating the variables one at a time until only one variable and one equation are left. Back substitution in the resulting triangular system of transformed functional equations then completes the solution. The idea completely parallels the Gauss-Jordan pivoting algorithm for solving a nonsingular system of linear equations. The method provides both a computational scheme and proofs of existence of a solution. Uniqueness of the solution is demonstrated in the discounted case. In the undiscounted case, variables are eliminated successively from a mixed system of equalities and inequalities until the remaining variables may be chosen arbitrarily, i.e., the solution is no longer unique. In this case, a minimal solution, subject to these free choices, is exhibited.
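To make the elimination idea concrete for the discounted case, the following is a minimal sketch, not the paper's own scheme. It assumes the functional equations have the standard form v_i = max_a [ r_i^a + beta * sum_j p_ij^a v_j ] with 0 <= beta < 1. The function name `solve_by_elimination`, the data layout, and the small two-state example are all hypothetical choices made for illustration. Each equation is held as a list of affine alternatives. Eliminating v_k first solves equation k for v_k by dividing each alternative by 1 - q_k, then substitutes the result into the earlier equations, which is valid because the coefficient on v_k is nonnegative. In this naive form the number of alternatives can grow exponentially, so the sketch is only suitable for tiny problems.

```python
def solve_by_elimination(rewards, probs, beta):
    """Sketch: solve v_i = max_a [r_i^a + beta * sum_j p_ij^a v_j], 0 <= beta < 1.

    rewards[i][a] is the immediate reward and probs[i][a][j] the transition
    probability (assumed layout, chosen for this illustration only).
    """
    n = len(rewards)
    # Equation i is a list of (constant, coefficients) alternatives meaning
    # v_i = max over alternatives of constant + sum_j coefficients[j] * v_j.
    eqs = [[(rewards[i][a], [beta * p for p in probs[i][a]])
            for a in range(len(rewards[i]))] for i in range(n)]

    def normalize(k):
        # Solve equation k for v_k: x = max_a (c_a + q_a x) with 0 <= q_a < 1
        # has the solution x = max_a c_a / (1 - q_a).
        out = []
        for c, q in eqs[k]:
            scale = 1.0 / (1.0 - q[k])
            new_q = [qj * scale for qj in q]
            new_q[k] = 0.0
            out.append((c * scale, new_q))
        return out

    # Forward phase: eliminate v_{n-1}, ..., v_1 from the earlier equations.
    for k in range(n - 1, 0, -1):
        eqs[k] = normalize(k)
        for i in range(k):
            expanded = []
            for c, q in eqs[i]:
                if q[k] == 0.0:
                    expanded.append((c, q))  # v_k does not appear here
                    continue
                # q[k] >= 0, so the max over equation k's alternatives can be
                # pulled outside: one new alternative per pair.
                for ck, qk in eqs[k]:
                    new_q = [qj + q[k] * qkj for qj, qkj in zip(q, qk)]
                    new_q[k] = 0.0
                    expanded.append((c + q[k] * ck, new_q))
            eqs[i] = expanded
    eqs[0] = normalize(0)  # one equation in one variable remains

    # Back substitution: equation k now involves only v_0, ..., v_{k-1}.
    v = [0.0] * n
    for k in range(n):
        v[k] = max(c + sum(qj * vj for qj, vj in zip(q, v)) for c, q in eqs[k])
    return v


# Hypothetical two-state, two-action example.
rewards = [[1.0, 0.0], [0.0, 2.0]]
probs = [[[0.5, 0.5], [0.0, 1.0]],
         [[1.0, 0.0], [0.2, 0.8]]]
beta = 0.9
v = solve_by_elimination(rewards, probs, beta)
print(v)
```

Because the solution is unique in the discounted case, one way to check the returned values is to substitute them back into the original functional equations and confirm that each holds with equality.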