Joint Strategy Fictitious Play with Inertia for Potential Games

We consider finite multi-player repeated games involving a large number of players with large strategy spaces and enmeshed utility structures. In these "large-scale" games, players are inherently limited in both their observational and computational capabilities. Accordingly, players in large-scale games need to make their decisions using algorithms that accommodate limitations in information gathering and processing. A motivating example is a congestion game in a complex transportation system, in which a large number of vehicles make daily routing decisions to optimize their own objectives in response to their observations. In this setting, observing and responding to the individual actions of all vehicles on a daily basis would be a formidable task for any individual driver. This disqualifies some well-known decision-making models, such as "Fictitious Play" (FP), as suitable models of driver routing behavior. A more realistic assumption is that an individual driver tracks and processes only the daily aggregate congestion on the specific roads of interest to that driver. We show that Joint Strategy Fictitious Play (JSFP), a close variant of FP, accommodates such information aggregation. Furthermore, we establish the convergence of JSFP to a pure Nash equilibrium in congestion games, or equivalently in finite potential games, when players use some inertia in their decisions, both with and without exponential discounting of the historical data.
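As a rough illustration of the idea summarized above, the following Python sketch simulates a JSFP-style rule with inertia in a toy congestion game. The abstract does not give the update equations, so everything here is an assumption made for illustration: each route is a single road, the latency is linear (`congestion_cost`), each player keeps a running average of the utility each road would have yielded given the observed aggregate loads, and a player whose current road is no longer the best keeps it anyway with probability `inertia`. The names `jsfp_with_inertia` and `is_pure_nash` are hypothetical helpers, not part of the paper.

```python
import random


def congestion_cost(load):
    # Assumed linear latency: the cost of a road equals its number of users.
    return float(load)


def is_pure_nash(actions, n_roads):
    # A profile is a pure Nash equilibrium if no player can lower its cost
    # by unilaterally switching to another road.
    loads = [0] * n_roads
    for a in actions:
        loads[a] += 1
    for a in actions:
        current = congestion_cost(loads[a])
        for r in range(n_roads):
            if r != a and congestion_cost(loads[r] + 1) < current:
                return False
    return True


def jsfp_with_inertia(n_players=10, n_roads=3, steps=200, inertia=0.5, seed=0):
    rng = random.Random(seed)
    actions = [rng.randrange(n_roads) for _ in range(n_players)]
    # avg_util[i][r]: running average of the utility player i would have
    # received on road r, given what the other players actually did.
    avg_util = [[0.0] * n_roads for _ in range(n_players)]

    for t in range(1, steps + 1):
        # Players only need the aggregate load per road, not who chose what.
        loads = [0] * n_roads
        for a in actions:
            loads[a] += 1

        for i in range(n_players):
            for r in range(n_roads):
                # Load on road r if player i were on it, others unchanged.
                load_if_on_r = loads[r] + (0 if actions[i] == r else 1)
                utility = -congestion_cost(load_if_on_r)
                # Incremental running mean (no discounting in this sketch).
                avg_util[i][r] += (utility - avg_util[i][r]) / t

        new_actions = []
        for i in range(n_players):
            best_value = max(avg_util[i])
            if avg_util[i][actions[i]] >= best_value - 1e-12:
                # Current road is already a best response: stay.
                new_actions.append(actions[i])
            elif rng.random() < inertia:
                # Inertia: repeat the previous action despite a better option.
                new_actions.append(actions[i])
            else:
                new_actions.append(
                    max(range(n_roads), key=lambda r: avg_util[i][r])
                )
        actions = new_actions

    return actions


if __name__ == "__main__":
    final = jsfp_with_inertia()
    print(final, is_pure_nash(final, n_roads=3))
```

Running the script prints the final road choices and whether they form a pure Nash equilibrium of the toy game. The sketch makes no claim about how many steps are needed; the convergence result stated in the abstract is asymptotic, and a finite run may or may not have settled.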
