Global Nash Convergence of Foster and Young's Regret Testing

We construct an uncoupled randomized strategy of repeated play such that, if every player follows such a strategy, then the joint mixed strategy profiles converge, almost surely, to a Nash equilibrium of the one-shot game. The procedure requires very little in terms of players’ information about the game. In fact, players’ actions are based only on their own past payoffs and, in a variant of the strategy, players need not even know that their payoffs are determined through other players’ actions. The procedure works for general finite games and is based on appropriate modifications of a simple stochastic learning rule introduced by Foster and Young [10].

[1] J. Hofbauer, et al. Uncoupled Dynamics Do Not Lead to Nash Equilibrium, 2003.

[2] Andreu Mas-Colell, et al. A General Class of Adaptive Strategies, 1999, J. Econ. Theory.

[3] H. Peyton Young, et al. Learning, hypothesis testing, and Nash equilibrium, 2003, Games Econ. Behav.

[4] A. Roth, et al. Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria, 1998.

[5] Sergiu Hart, et al. Regret-based continuous-time dynamics, 2003, Games Econ. Behav.

[6] Gábor Lugosi, et al. Learning correlated equilibria in games with compact sets of strategies, 2007, Games Econ. Behav.

[7] Klaus Ritzberger, et al. The theory of normal form games from the differentiable viewpoint, 1994.

[8] J. Harsanyi. Oddness of the number of equilibrium points: A new proof, 1973.

[9] S. Hart. Adaptive Heuristics, 2005.

[10] Sham M. Kakade, et al. Deterministic calibration and Nash equilibrium, 2004, J. Comput. Syst. Sci.

[11] J. Shawe-Taylor. Potential-Based Algorithms in On-Line Prediction and Game Theory, 2001.

[12] Andreu Mas-Colell, et al. Stochastic Uncoupled Dynamics and Nash Equilibrium, 2004, Games Econ. Behav.

[13] S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[14] Marie-Françoise Roy, et al. Real algebraic geometry, 1992.

[15] R. Vohra, et al. Calibrated Learning and Correlated Equilibrium, 1996.

[16] S. Hart, et al. A Reinforcement Procedure Leading to Correlated Equilibrium, 2001.

[17] Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.

[18] Dean P. Foster, et al. Regret Testing: A Simple Payoff-Based Procedure for Learning Nash Equilibrium, 2005.

[19] Lawrence E. Blume, et al. The Algebraic Geometry of Perfect and Sequential Equilibrium, 1994.

[20] E. van Damme. Stability and perfection of Nash equilibria, 1987.

[21] D. Fudenberg, et al. The Theory of Learning in Games, 1998.

[22] Drew Fudenberg, et al. Learning to Play Bayesian Games, 2001, Games Econ. Behav.

[23] J. Jordan. Bayesian Learning in Repeated Games, 1995.

[24] Amotz Cahn, et al. General procedures leading to correlated equilibria, 2004, Int. J. Game Theory.

[25] D. Fudenberg, et al. Steady state learning and Nash equilibrium, 1993.

[26] Tilman Börgers, et al. Naive Reinforcement Learning With Endogenous Aspirations, 2000.

[27] John Nachbar. Prediction, optimization, and learning in repeated games, 1997.

[28] J. Jordan, et al. Bayesian learning in normal form games, 1991.

[29] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.

[30] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.

[31] W. Hoeffding. Probability Inequalities for sums of Bounded Random Variables, 1963.

[32] E. Kalai, et al. Rational Learning Leads to Nash Equilibrium, 1993.

[33] R. Rob, et al. Learning, Mutation, and Long Run Equilibria in Games, 1993.

[34] H. Young, et al. The Evolution of Conventions, 1993.

[35] William R. Zame, et al. The Algebraic Geometry of Games and the Tracing Procedure, 1991.