A General Class of Adaptive Strategies

We exhibit and characterize an entire class of simple adaptive strategies, in the repeated play of a game, having the Hannan-consistency property: in the long run, the player is guaranteed an average payoff as large as the best-reply payoff to the empirical distribution of play of the other players; i.e., there is no “regret.” Smooth fictitious play (Fudenberg and Levine [1995, J. Econ. Dynam. Control 19, 1065–1090]) and regret-matching (Hart and Mas-Colell [2000, Econometrica 68, 1127–1150]) are particular cases. The motivation and application of the current paper come from the study of procedures whose empirical distribution of play is, in the long run, (almost) a correlated equilibrium. For the analysis we first develop a generalization of Blackwell's (1956, Pacific J. Math. 6, 1–8) approachability strategy for games with vector payoffs. Journal of Economic Literature Classification Numbers: C7, D7, C6.
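To make the no-regret property concrete, the following is a minimal sketch of one member of the class, an unconditional form of regret-matching, in which each action is played with probability proportional to the positive part of its regret. The function names, the uniform choice when no regret is positive, the matching-pennies payoff matrix, and the opponent's fixed cycle of moves are all assumptions made for illustration; they are not taken from the paper.

```python
import random


def regret_matching_probs(regrets):
    """Mixed action proportional to the positive parts of the regrets."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        # No positive regret: play is unrestricted; uniform is an assumption here.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def simulate(payoff, opponent_moves, seed=0):
    """Play against a fixed sequence of opponent moves; return average regrets.

    payoff[i][j] is the player's payoff for action i against opponent action j.
    Entry k of the result is the average gain the player would have obtained
    by always playing k instead of the actions actually chosen.
    """
    rng = random.Random(seed)
    n = len(payoff)
    cumulative = [0.0] * n
    for j in opponent_moves:
        probs = regret_matching_probs(cumulative)
        i = rng.choices(range(n), weights=probs)[0]
        for k in range(n):
            cumulative[k] += payoff[k][j] - payoff[i][j]
    return [c / len(opponent_moves) for c in cumulative]


# Illustrative setup (assumed): matching pennies against a repeating 0, 0, 1 cycle.
MATCHING_PENNIES = [[1.0, -1.0], [-1.0, 1.0]]
moves = [(0, 0, 1)[t % 3] for t in range(5000)]
avg_regret = simulate(MATCHING_PENNIES, moves)
print(max(avg_regret))
```

Hannan-consistency corresponds to the largest entry of `avg_regret` having a nonpositive limit as the number of periods grows, whatever the opponent does; in a finite run one should expect only a small value.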
