Reactive learning strategies for iterated games

In an iterated game between two players, there is much interest in characterizing the set of feasible pay-offs when one player uses a fixed strategy and the other is free to switch. Such characterizations have led to the discovery of extortionists, equalizers, partners and rivals. Most of these studies use memory-one strategies, which specify the probabilities of taking each action depending on the outcome of the previous round. Here, we consider ‘reactive learning strategies’, which gradually modify their propensity to take certain actions based on the opponent’s past actions. Every linear reactive learning strategy, p*, corresponds to a memory-one strategy, p, and vice versa. We prove that to evaluate the region of feasible pay-offs against a memory-one strategy, C(p), it suffices to check its performance against at most 11 other strategies; thus, C(p) is the convex hull in R^2 of at most 11 points. Furthermore, if p is a memory-one strategy with feasible pay-off region C(p), and p* is the corresponding reactive learning strategy with feasible pay-off region C(p*), then C(p*) is a subset of C(p). Reactive learning strategies are therefore powerful tools for restricting the outcomes of iterated games.
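To make the memory-one setting concrete, the sketch below (not from the paper; the payoff values and function name are illustrative) computes the long-run pay-offs when two memory-one strategies meet in an iterated prisoner's dilemma. Each strategy is a vector of cooperation probabilities conditioned on the previous round's outcome (CC, CD, DC, DD from that player's own perspective); the play then follows a four-state Markov chain whose stationary distribution yields both pay-offs. Feasible pay-off regions such as C(p) are sets of such pay-off pairs as the co-player's strategy varies.

```python
import numpy as np

def stationary_payoffs(p, q, payoffs=(3, 0, 5, 1)):
    """Long-run pay-offs of two memory-one strategies in an iterated
    prisoner's dilemma with pay-offs (R, S, T, P).

    p and q give each player's probability of cooperating after the
    outcomes (CC, CD, DC, DD), each seen from that player's own
    perspective.  Assumes the induced Markov chain is ergodic, which
    holds whenever both strategies are strictly stochastic.
    """
    R, S, T, P = payoffs
    # The co-player sees the joint states with roles swapped,
    # so reorder q to the focal player's state ordering.
    q = (q[0], q[2], q[1], q[3])
    # Transition matrix over joint states (focal move, co-player move).
    M = np.array([[pi * qi, pi * (1 - qi), (1 - pi) * qi, (1 - pi) * (1 - qi)]
                  for pi, qi in zip(p, q)])
    # Stationary distribution v: solve v M = v with sum(v) = 1.
    A = np.vstack([M.T - np.eye(4), np.ones(4)])
    b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Focal player earns (R, S, T, P) over the states; co-player (R, T, S, P).
    return float(v @ [R, S, T, P]), float(v @ [R, T, S, P])
```

For instance, two uniformly random players each earn the average (3+0+5+1)/4 = 2.25, while a near-always-defector facing a near-always-cooperator earns close to T = 5.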
