Singularly perturbed Markov control problem: Limiting average cost

In this paper we consider a singularly perturbed Markov decision process (MDP) with the limiting average cost criterion. We assume that the underlying process is composed of n separate irreducible processes, and that the small perturbation is such that it “unites” these processes into a single irreducible process. We formulate the underlying control problem for the singularly perturbed MDP, and call it the “limit Markov control problem” (limit MCP). We prove the validity of the “limit control principle,” which states that, for any sufficiently small perturbation, an optimal solution of the perturbed MDP can be approximated by an optimal solution of the limit MCP. We also demonstrate that the limit Markov control problem is equivalent to a suitably constructed nonlinear program in the space of long-run state-action frequencies. This approach combines the solutions of the original separated irreducible MDPs with the stationary distribution of a certain “aggregated MDP,” and creates a framework for future algorithmic approaches.
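To make the “uniting” effect of the perturbation concrete, the following is a minimal numerical sketch (not from the paper): two separate irreducible two-state chains are arranged block-diagonally, so the unperturbed chain is not irreducible, and a generator-like perturbation matrix D with zero row sums links the blocks. For any ε > 0, P + εD is a single irreducible chain with a unique stationary distribution. The matrices P and D and the perturbation sizes are illustrative assumptions chosen for simplicity.

```python
import numpy as np

# Two separate irreducible 2-state chains, arranged block-diagonally.
# The combined chain is NOT irreducible: states {0,1} never reach {2,3}.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
])

# Perturbation with zero row sums that links the two blocks, so that
# P + eps*D remains stochastic and is irreducible for any eps > 0.
D = np.array([
    [-0.1,  0.0,  0.1,  0.0],
    [ 0.0, -0.1,  0.0,  0.1],
    [ 0.1,  0.0, -0.1,  0.0],
    [ 0.0,  0.1,  0.0, -0.1],
])

def stationary(P):
    """Stationary distribution: normalized left eigenvector for eigenvalue 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

# The perturbed chain has a unique stationary distribution for every eps > 0,
# even though the unperturbed block-diagonal chain does not.
for eps in (1e-1, 1e-3, 1e-6):
    print(eps, np.round(stationary(P + eps * D), 4))
```

In this symmetric example P + εD is doubly stochastic, so the stationary distribution is uniform for every ε > 0; more generally, its ε → 0 limit weights the stationary distributions of the individual blocks by an aggregated chain over the blocks, which is the structure the limit MCP exploits.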