On N-person stochastic games with denumerable state space

This paper considers non-cooperative N-person stochastic games with a countable state space and compact metric action spaces. We concentrate on the average return per unit time criterion, for which the existence of an equilibrium policy is established under a number of recurrency conditions on the transition probability matrices associated with the stationary policies. These results are obtained by first establishing the existence of total discounted return equilibrium policies for each discount factor α ∈ [0, 1), and then by showing that, under each of the aforementioned recurrency conditions, average return equilibrium policies arise as limit policies of sequences of discounted return equilibrium policies as the discount factor tends to one. Finally, we review and extend the results known for the case where both the state space and the action spaces are finite.
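
For concreteness, the two criteria and the limiting argument described above can be written out as follows; the notation (V_α^i, g^i, r_i, X_t, A_t) is ours, introduced only as a sketch of the standard definitions and not as the paper's exact formulation. Suppose player i receives the one-step reward r_i(X_t, A_t) when the joint action A_t is taken in state X_t. For a stationary joint policy π = (π^1, …, π^N) and initial state x, the discounted and average return criteria read

\[
V_\alpha^{i}(x,\pi) \;=\; \mathbb{E}_x^{\pi}\Bigl[\sum_{t=0}^{\infty} \alpha^{t}\, r_i(X_t, A_t)\Bigr], \qquad 0 \le \alpha < 1,
\]
\[
g^{i}(x,\pi) \;=\; \liminf_{T\to\infty} \frac{1}{T}\, \mathbb{E}_x^{\pi}\Bigl[\sum_{t=0}^{T-1} r_i(X_t, A_t)\Bigr].
\]

A joint policy π* is an α-discounted equilibrium if no player can improve his discounted return by a unilateral deviation,

\[
V_\alpha^{i}\bigl(x,\pi^{*}\bigr) \;\ge\; V_\alpha^{i}\bigl(x,(\pi^{*,-i},\sigma^{i})\bigr) \quad \text{for all states } x, \text{ all players } i \text{ and all policies } \sigma^{i} \text{ of player } i,
\]

and an average return equilibrium is defined analogously with g^i in place of V_α^i. Under recurrency conditions of the type imposed here, a standard Abelian argument gives, for stationary policies,

\[
g^{i}(x,\pi) \;=\; \lim_{\alpha \uparrow 1}\,(1-\alpha)\, V_\alpha^{i}(x,\pi),
\]

which is the relation that allows limit points of α-discounted equilibrium policies, as α tends to one, to be identified as average return equilibrium policies.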
