Joint Strategy Fictitious Play with Inertia for Potential Games

We consider finite multi-player repeated games involving a large number of players with large strategy spaces and enmeshed utility structures. In these "large-scale" games, players face inherent limitations in both their observational and computational capabilities. Accordingly, players in large-scale games need to make their decisions using algorithms that accommodate limitations in information gathering and processing. A motivating example is a congestion game in a complex transportation system, in which a large number of vehicles make daily routing decisions to optimize their own objectives in response to their observations. In this setting, observing and responding to the individual actions of all vehicles on a daily basis would be a formidable task for any individual driver. This rules out some well-known decision-making models, such as "Fictitious Play" (FP), as suitable models of driver routing behavior. A more realistic assumption is that an individual driver tracks and processes only the daily aggregate congestion on the specific roads that are of interest to that driver. We show that Joint Strategy Fictitious Play (JSFP), a close variant of FP, accommodates such information aggregation. Furthermore, we establish the convergence of JSFP to a pure Nash equilibrium in congestion games, or equivalently in finite potential games, when players use some inertia in their decisions, both with and without exponential discounting of the historical data.
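The abstract does not spell out the update rule, so the following is only a minimal sketch of how JSFP with inertia might look in a toy routing game. It assumes that each driver picks exactly one road, that every road has the same linear cost (`road_cost` is hypothetical), and that inertia means repeating yesterday's road with a fixed probability whenever that road is not currently a best response. All function and parameter names are illustrative and are not taken from the paper.

```python
import random


def road_cost(load, slope=1.0):
    """Hypothetical linear congestion cost of a road carrying `load` vehicles."""
    return slope * load


def loads_of(actions, n_roads):
    """Aggregate congestion: number of drivers on each road."""
    loads = [0] * n_roads
    for road in actions:
        loads[road] += 1
    return loads


def is_pure_nash(actions, n_roads):
    """True if no driver can lower its own cost by switching roads alone."""
    loads = loads_of(actions, n_roads)
    for road in actions:
        for alt in range(n_roads):
            if alt != road and road_cost(loads[alt] + 1) < road_cost(loads[road]):
                return False
    return True


def jsfp_with_inertia(n_players, n_roads, steps, inertia=0.5, seed=0):
    """Sketch of JSFP with inertia; returns the joint action after `steps` days."""
    rng = random.Random(seed)
    actions = [rng.randrange(n_roads) for _ in range(n_players)]
    # avg_utility[i][r]: running average of the utility driver i would have
    # received on road r against the other drivers' actual past choices.
    avg_utility = [[0.0] * n_roads for _ in range(n_players)]
    for t in range(1, steps + 1):
        loads = loads_of(actions, n_roads)  # only aggregate data is observed
        for i in range(n_players):
            for r in range(n_roads):
                others = loads[r] - (1 if actions[i] == r else 0)
                utility = -road_cost(others + 1)
                avg_utility[i][r] += (utility - avg_utility[i][r]) / t
        next_actions = []
        for i in range(n_players):
            best = max(avg_utility[i])
            stay = avg_utility[i][actions[i]] == best or rng.random() < inertia
            if stay:
                next_actions.append(actions[i])  # inertia: repeat last road
            else:
                best_roads = [r for r in range(n_roads) if avg_utility[i][r] == best]
                next_actions.append(rng.choice(best_roads))
        actions = next_actions
    return actions


if __name__ == "__main__":
    final = jsfp_with_inertia(n_players=6, n_roads=3, steps=500)
    print(final, is_pure_nash(final, 3))
```

Note that each driver's update uses only the per-road loads, never the identities of the other drivers, which is the information aggregation the abstract describes. A variant with exponential discounting could plausibly replace the `1 / t` step size with a constant in (0, 1); that choice is likewise an assumption here, not something the abstract specifies.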
