A Lyapunov optimization approach to repeated stochastic games

This paper considers a time-varying game with N players. Every time slot, players observe their own random events and then take a control action. The events and control actions affect the individual utilities earned by each player. The goal is to maximize a concave function of time average utilities subject to equilibrium constraints. Specifically, participating players are provided access to a common source of randomness from which they can optimally correlate their decisions. The equilibrium constraints incentivize participation by ensuring that players cannot earn more utility if they choose not to participate. This form of equilibrium is similar to the notions of Nash equilibrium and correlated equilibrium, but is simpler to attain. A Lyapunov method is developed that solves the problem in an online max-weight fashion by selecting actions based on a set of time-varying weights. The algorithm does not require knowledge of the event probabilities. A similar method can be used to compute a standard correlated equilibrium, albeit with increased complexity.
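As a rough illustration of the online max-weight idea described above, the following minimal sketch keeps one virtual queue per player and, in each slot, picks the action that maximizes a weighted sum of utilities. It is not the paper's algorithm: the two-player utility table, the fixed non-participation baseline, the linear sum-utility objective (a simple special case of a concave objective), the parameter `V`, and all function names are assumptions made for illustration only.

```python
import random


def max_weight_step(queues, utilities_by_action, V):
    """Return the action maximizing sum_i (V + Q_i) * u_i(action).

    The weights (V + Q_i) vary over time with the virtual queues Q_i.
    """
    def score(action):
        utilities = utilities_by_action[action]
        return sum((V + q) * u for q, u in zip(queues, utilities))

    return max(utilities_by_action, key=score)


def update_queues(queues, baseline, earned):
    """Virtual queue update: Q_i <- max(Q_i + baseline_i - earned_i, 0).

    A queue grows while a player earns less than its (hypothetical)
    non-participation baseline, which raises that player's weight.
    """
    return [max(q + b - u, 0.0) for q, b, u in zip(queues, baseline, earned)]


def simulate(T=5000, V=10.0, seed=0):
    """Run the sketch on a made-up two-player, two-event game."""
    rng = random.Random(seed)
    # Hypothetical utilities: table[event][action] = (u_player1, u_player2).
    table = {
        0: {"a": (1.0, 0.0), "b": (0.0, 1.0)},
        1: {"a": (0.6, 0.6), "b": (0.2, 0.9)},
    }
    baseline = (0.3, 0.3)  # assumed utility of not participating
    queues = [0.0, 0.0]
    totals = [0.0, 0.0]
    for _ in range(T):
        # The event is observed, but its probability is never used.
        event = rng.randint(0, 1)
        action = max_weight_step(queues, table[event], V)
        earned = table[event][action]
        queues = update_queues(queues, baseline, earned)
        totals = [s + u for s, u in zip(totals, earned)]
    averages = [s / T for s in totals]
    return averages, queues


if __name__ == "__main__":
    averages, queues = simulate()
    print("time-average utilities:", averages)
    print("final virtual queues:", queues)
```

In this sketch, each player's time-average utility is at least its baseline minus its final queue value divided by `T`, which follows directly from the queue update, so the constraint gap shrinks whenever the queues stay bounded. A larger `V` puts more emphasis on total utility relative to the constraints.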
