On the uniqueness of solutions to the Poisson equations for average cost Markov chains with unbounded cost functions

Abstract: We consider the Poisson equations for denumerable Markov chains with unbounded cost functions. We show that solutions exist in the Banach space of real-valued functions that are bounded with respect to a weighted supremum norm under which the Markov chain is geometrically ergodic. Under minor additional assumptions the solution is also unique, and we give a novel probabilistic proof of this fact using relations between ergodicity and recurrence. In general, the Poisson equations admit many solutions; the unique one is the solution with finite weighted supremum norm. We illustrate how to determine this solution by considering three queueing examples: a multi-server queue, two independent single-server queues, and a priority queue with dependence between the queues.
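The Poisson equation discussed in the abstract has the form g + h(x) = c(x) + Σ_y P(x, y) h(y), where g is the average cost and h is the (relative) value function, determined only up to an additive constant unless a norm condition pins it down. The following is a minimal numerical sketch of this structure on a finite birth-death chain with linear cost; the truncation level, arrival rate, service rate, and the normalization h(0) = 0 are all illustrative assumptions, not the paper's infinite-state setting or its weighted-norm argument.

```python
import numpy as np

# Illustrative truncated birth-death chain (an M/M/1-like queue, assumed
# parameters) with unbounded cost c(x) = x, cut off at level N.
N = 50
lam, mu = 0.5, 1.0            # assumed arrival and service rates
up, down = lam / (lam + mu), mu / (lam + mu)

P = np.zeros((N + 1, N + 1))
for x in range(N + 1):
    if x < N:
        P[x, x + 1] = up      # arrival
    else:
        P[x, x] += up         # reflect at the truncation boundary
    if x > 0:
        P[x, x - 1] = down    # service completion
    else:
        P[x, x] += down       # dummy transition in the empty state

c = np.arange(N + 1, dtype=float)   # cost c(x) = x

# Stationary distribution pi (left eigenvector of P for eigenvalue 1).
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
pi /= pi.sum()
g = pi @ c                          # long-run average cost

# The Poisson equation (I - P) h = c - g is singular: constants span the
# null space of (I - P). Pinning h(0) = 0 selects one solution, mirroring
# the role of the finite-weighted-norm condition in the paper.
A = np.eye(N + 1) - P
A[0, :] = 0.0
A[0, 0] = 1.0
rhs = c - g
rhs[0] = 0.0
h = np.linalg.solve(A, rhs)

# Verify that h solves the Poisson equation in every state.
resid = (np.eye(N + 1) - P) @ h - (c - g)
print(np.max(np.abs(resid)))
```

Note that the row replaced by the normalization is still satisfied automatically: since π(I − P) = 0 and π(c − g·1) = 0, consistency of the remaining rows forces the pinned row to hold as well, so the residual is (numerically) zero everywhere.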
