Zero-sum Markov games and worst-case optimal control of queueing systems

Zero-sum stochastic games model situations in which two persons, called players, control a dynamic system and have opposite objectives. Typically, one player wishes to minimize a cost that has to be paid to the other player. Such a game may also be used to model problems with a single controller who has only partial information on the system: the dynamics of the system may depend on some parameter that is unknown to the controller and may vary in time in an unpredictable way. A worst-case criterion may then be considered, where the unknown parameter is assumed to be chosen by “nature” (called player 1), and the objective of the controller (player 2) is to design a policy that guarantees the best performance under the worst-case behaviour of nature. The purpose of this paper is to present a survey of stochastic games in queues, covering both tools and applications. The first part is devoted to the tools. We present some existing tools for solving finite-horizon and infinite-horizon discounted Markov games with unbounded cost, and develop new ones that are typically applicable in queueing problems. We then present some new tools and theory for expected average cost stochastic games with unbounded cost. In the second part of the paper we survey existing results on worst-case control of queues, and illustrate the structural properties of the best policies of the controller, the worst-case policies of nature, and the value function. Using the theory developed in the first part, we extend some of these results, which were known to hold for finite-horizon costs or for the discounted cost, to the expected average cost.
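
To make the worst-case criterion concrete, the following is a minimal sketch of discounted min-max value iteration on a toy single-server queue. It is an illustration only and is not taken from the paper: the buffer size, the arrival and service probabilities, the cost structure, and the function name `worst_case_value_iteration` are all hypothetical. It also assumes that nature observes the controller's action before choosing its own, so pure strategies suffice; with simultaneous moves one would in general need the value of a matrix game in mixed strategies at each state.

```python
def worst_case_value_iteration(buffer_size=10, beta=0.9, tol=1e-9, max_iter=10_000):
    """Discounted min-max value iteration for a toy slotted queue.

    All numbers below are hypothetical. In each slot at most one event occurs:
    an arrival (probability chosen by nature) or a service completion
    (probability chosen by the controller).
    """
    service_probs = {"slow": 0.2, "fast": 0.4}   # controller (player 2) actions
    service_cost = {"slow": 0.0, "fast": 1.0}    # extra cost of serving fast
    arrival_probs = {"light": 0.2, "heavy": 0.5}  # nature (player 1) actions

    def q_value(x, a, b, V):
        lam = arrival_probs[b]
        mu = service_probs[a] if x > 0 else 0.0   # nothing to serve when empty
        up = min(x + 1, buffer_size)              # arrivals to a full buffer are lost
        down = max(x - 1, 0)
        expected = lam * V[up] + mu * V[down] + (1.0 - lam - mu) * V[x]
        # Immediate cost: holding cost x plus the cost of the service action.
        return x + service_cost[a] + beta * expected

    def worst_case(x, a, V):
        # Nature picks the arrival rate that is worst for the controller.
        return max(q_value(x, a, b, V) for b in arrival_probs)

    V = [0.0] * (buffer_size + 1)
    for _ in range(max_iter):
        new_V = [min(worst_case(x, a, V) for a in service_probs)
                 for x in range(buffer_size + 1)]
        diff = max(abs(n - o) for n, o in zip(new_V, V))
        V = new_V
        if diff < tol:
            break

    # Controller's min-max action in each state under the converged values.
    policy = [min(service_probs, key=lambda a: worst_case(x, a, V))
              for x in range(buffer_size + 1)]
    return V, policy


if __name__ == "__main__":
    values, policy = worst_case_value_iteration()
    for x, (v, a) in enumerate(zip(values, policy)):
        print(f"queue length {x:2d}: value {v:8.3f}, controller serves {a}")
```

Because the per-step cost here is bounded and the discount factor is below one, the iteration is a contraction and converges. The unbounded-cost and average-cost settings that the paper addresses need additional conditions that this sketch does not attempt to capture.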
