Strategic Teaching and Learning in Games

It is known that there are uncoupled learning heuristics leading to Nash equilibrium in all finite games. Why should players use such learning heuristics, and where could they come from? We show that there is no uncoupled learning heuristic leading to Nash equilibrium in all finite games that a player has an incentive to adopt, that would be "evolutionarily stable", or that "could learn itself". Rather, a player has an incentive to strategically teach such a learning opponent in order to secure at least the Stackelberg leader payoff. The impossibility result remains intact when restricted to the classes of generic games, two-player games, potential games, games with strategic complements, or 2x2 games, in which learning is known to be "nice". More generally, it also applies to uncoupled learning heuristics leading to correlated equilibria, rationalizable outcomes, iterated admissible outcomes, or minimal curb sets. A possibility result restricted to "strategically trivial" games fails if some generic games outside this class are considered as well.
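The incentive to teach can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 2x2 payoff matrix is hypothetical, the learner is modeled as a myopic best replier (a simple stand-in for a learning heuristic that settles on Nash equilibrium in this game), and the Stackelberg leader payoff is computed over pure commitments only.

```python
# Hypothetical 2x2 game (an assumption for illustration, not from the paper).
# PAYOFFS[(row_action, col_action)] = (row payoff, column payoff)
PAYOFFS = {
    ("U", "L"): (2, 1), ("U", "R"): (4, 0),
    ("D", "L"): (1, 0), ("D", "R"): (3, 2),
}
ROWS, COLS = ("U", "D"), ("L", "R")


def col_best_reply(row_action):
    """Column player's best reply to a row action (unique in this game)."""
    return max(COLS, key=lambda c: PAYOFFS[(row_action, c)][1])


def row_best_reply(col_action):
    """Row player's best reply to a column action (unique in this game)."""
    return max(ROWS, key=lambda r: PAYOFFS[(r, col_action)][0])


def pure_nash():
    """All pure-strategy Nash equilibria: mutual best replies."""
    return [(r, c) for r in ROWS for c in COLS
            if row_best_reply(c) == r and col_best_reply(r) == c]


def stackelberg_leader():
    """Best pure commitment for the row player when the column player best replies."""
    best = max(ROWS, key=lambda r: PAYOFFS[(r, col_best_reply(r))][0])
    return best, PAYOFFS[(best, col_best_reply(best))][0]


def average_row_payoff(row_policy, periods=100, col_initial="L"):
    """Average row payoff against a learner that best replies to last period's row action.

    row_policy maps the learner's previous action to the row player's action.
    """
    row_prev, col_prev, total = None, col_initial, 0
    for _ in range(periods):
        col_action = col_initial if row_prev is None else col_best_reply(row_prev)
        row_action = row_policy(col_prev)
        total += PAYOFFS[(row_action, col_action)][0]
        row_prev, col_prev = row_action, col_action
    return total / periods


if __name__ == "__main__":
    print("Pure Nash equilibria:", pure_nash())
    print("Stackelberg leader (action, payoff):", stackelberg_leader())
    print("Teacher always plays D:", average_row_payoff(lambda _: "D"))
    print("Row also best replies:", average_row_payoff(row_best_reply))
```

In this hypothetical game, U strictly dominates D for the row player, so the unique Nash equilibrium is (U, L) and the row player earns 2 there. By persistently playing the dominated action D, the row player steers the learner to R and earns 3 in every period after the first, which is the pure-commitment Stackelberg leader payoff.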
