A General Class of Adaptive Strategies

We exhibit and characterize an entire class of simple adaptive strategies, in the repeated play of a game, having the Hannan-consistency property: in the long run, the player is guaranteed an average payoff as large as the best-reply payoff to the empirical distribution of play of the other players; i.e., there is no "regret." Smooth fictitious play (Fudenberg and Levine [1995]) and regret-matching (Hart and Mas-Colell [1998]) are particular cases. The motivation and application of this work come from the study of procedures whose empirical distribution of play is, in the long run, (almost) a correlated equilibrium. The basic tool for the analysis is a generalization of Blackwell's [1956a] approachability strategy for games with vector payoffs.
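To make the "no regret" property concrete, the following is a minimal sketch of an unconditional regret-matching rule: each action is played with probability proportional to the positive part of its cumulative regret. It is an illustration under stated assumptions, not the paper's general class or its exact procedure. The function names, the `payoff[k][j]` matrix layout, the uniform fallback when no regret is positive, and the fixed opponent sequence are all hypothetical choices made for the example.

```python
import random


def regret_matching_probs(cum_regret):
    """Mixed action proportional to the positive parts of cumulative regrets."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        # Assumption: play uniformly when no action has positive regret.
        n = len(cum_regret)
        return [1.0 / n] * n
    return [p / total for p in positive]


def simulate(payoff, opponent_actions, seed=0):
    """Play regret matching against a fixed sequence of opponent actions.

    payoff[k][j] is the player's payoff for action k against opponent
    action j (hypothetical layout). Returns the largest average regret,
    i.e. the gap between the best fixed action in hindsight and the
    average payoff actually obtained.
    """
    rng = random.Random(seed)
    n = len(payoff)
    cum_regret = [0.0] * n
    for j in opponent_actions:
        probs = regret_matching_probs(cum_regret)
        k = rng.choices(range(n), weights=probs)[0]
        realized = payoff[k][j]
        # Regret of action a: what a would have earned minus what was earned.
        for a in range(n):
            cum_regret[a] += payoff[a][j] - realized
    return max(cum_regret) / len(opponent_actions)


if __name__ == "__main__":
    matching_pennies = [[1.0, -1.0], [-1.0, 1.0]]
    # Opponent that always plays its first action.
    print(simulate(matching_pennies, [0] * 1000))
```

Hannan consistency corresponds to the returned quantity being at most zero in the limit as the number of periods grows, whatever the opponent's sequence. The sketch only reports the value for one finite run.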

[1]  Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.

[2]  S. Hart, et al. A simple adaptive procedure leading to correlated equilibrium, 2000.

[3]  Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.

[4]  Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.

[5]  Allan Borodin, et al. Online computation and competitive analysis, 1998.

[6]  Dean P. Foster, et al. A Randomization Rule for Selecting Forecasts, 1993, Oper. Res.

[7]  J. Robinson. An Iterative Method of Solving a Game, 1951, Classics in Game Theory.

[8]  D. Monderer, et al. Belief Affirming in Learning Processes, 1997.

[9]  S. Hart, et al. A Reinforcement Procedure Leading to Correlated Equilibrium, 2001.

[10]  D. Fudenberg, et al. Conditional Universal Consistency, 1999.

[11]  D. Fudenberg, et al. Consistency and Cautious Fictitious Play, 1995.

[12]  James Hannan, et al. Approximation to Bayes Risk in Repeated Play, 1958.

[13]  F. Clarke. Optimization And Nonsmooth Analysis, 1983.

[14]  P. Rivière. Quelques modeles de jeux d'evolution, 1997.

[15]  Dean P. Foster, et al. Calibrated Learning and Correlated Equilibrium, 1997.

[16]  N. Megiddo. On repeated games with incomplete information played by non-Bayesian players, 1980.

[17]  Howard Raiffa, et al. Games And Decisions, 1958.

[18]  Dean P. Foster, et al. Regret in the On-Line Decision Problem, 1999.

[19]  Michel Loève, et al. Probability Theory I, 1977.

[20]  D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.

[21]  A. Banos. On Pseudo-Games, 1968.