Player-compatible learning and player-compatible equilibrium

Player-Compatible Equilibrium (PCE) imposes cross-player restrictions on the magnitudes of the players' "trembles" onto different strategies. These restrictions capture the idea that trembles correspond to deliberate experiments by agents who are unsure of the prevailing distribution of play. PCE selects intuitive equilibria in a number of examples where trembling-hand perfect equilibrium (Selten, 1975) and proper equilibrium (Myerson, 1978) have no bite. We show that rational learning and some commonly used heuristics imply our compatibility restrictions in a steady-state setting.

[1]  Bruno H. Strulovici.  Learning While Voting: Determinants of Collective Experimentation , 2010 .

[2]  D. Fudenberg,et al.  Justified Communication Equilibrium , 2021 .

[3]  Choosing a good toolkit, I: Prior-free heuristics , 2020 .

[4]  Ehud Lehrer,et al.  Partially-Specified Probabilities: Decisions and Games , 2006 .

[5]  David M. Kreps,et al.  Learning in Extensive Games, II: Experimentation and Nash Equilibrium , 2010 .

[6]  Pierpaolo Battigalli,et al.  Analysis of information feedback and selfconfirming equilibrium , 2016 .

[7]  Aurélien Garivier,et al.  On Bayesian Upper Confidence Bounds for Bandit Problems , 2012, AISTATS.

[8]  Bhaskar Krishnamachari,et al.  Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations , 2010, IEEE/ACM Transactions on Networking.

[9]  Daniel Friedman,et al.  Individual Learning in Normal Form Games: Some Laboratory Results , 1997 .

[10]  A. Rubinstein,et al.  Rationalizable Conjectural Equilibrium: Between Nash and Rationalizability , 1994 .

[11]  Navin Kartik,et al.  Optimal Contracts for Experimentation , 2016 .

[12]  Philipp Strack,et al.  Strategic Experimentation with Private Payoffs , 2015, J. Econ. Theory.

[13]  Drew Fudenberg,et al.  Payoff information and learning in signaling games , 2020, Games Econ. Behav.

[14]  D. Fudenberg,et al.  Learning and Type Compatibility in Signaling Games , 2017, arXiv:1702.01819.

[15]  Strategic Experimentation with Exponential Bandits , 2005 .

[16]  L. Shapley,et al.  Potential Games , 1994 .

[17]  H. Robbins,et al.  Sequential choice from several populations , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[18]  D. Fudenberg,et al.  Steady state learning and Nash equilibrium , 1993 .

[19]  J. Mertens,et al.  On the Strategic Stability of Equilibria , 1986 .

[20]  M. Jackson,et al.  A Strategic Model of Social and Economic Networks , 1996 .

[21]  R. Selten.  Reexamination of the perfectness concept for equilibrium points in extensive games , 1975, Classics in Game Theory.

[22]  Drew Fudenberg,et al.  Learning in extensive-form games I. Self-confirming equilibria , 1995 .

[23]  Y. Ishii,et al.  Innovation Adoption by Forward-Looking Social Learners , 2015 .

[24]  Andrzej Skrzypacz,et al.  Learning, Experimentation, and Information Design , 2017 .

[25]  Roland G. Fryer,et al.  Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability , 2013, Math. Oper. Res..

[26]  E. Damme.  Stability and perfection of Nash equilibria , 1987 .

[27]  R. Myerson.  Refinements of the Nash equilibrium concept , 1978 .

[28]  T. Lai.  Adaptive treatment allocation and the multi-armed bandit problem , 1987 .

[29]  Colin Camerer,et al.  Experience‐weighted Attraction Learning in Normal Form Games , 1999 .

[30]  D. Fudenberg,et al.  Superstition and Rational Learning , 2006 .

[31]  E. Damme.  Refinements of the Nash Equilibrium Concept , 1983 .

[32]  Josef Hofbauer,et al.  Learning in games with unstable equilibria , 2005, J. Econ. Theory.

[33]  David M. Kreps,et al.  Signaling Games and Stable Equilibria , 1987 .

[34]  Drew Fudenberg,et al.  Limit Points of Endogenous Misspecified Learning , 2021 .

[35]  David Pearce.  Rationalizable Strategic Behavior and the Problem of Perfection , 1984 .

[36]  Laura Doval,et al.  Whether or not to open Pandora's box , 2018, J. Econ. Theory.

[37]  J. Gittins.  Bandit processes and dynamic allocation indices , 1979 .

[38]  M. Cripps,et al.  Strategic Experimentation with Exponential Bandits , 2003 .

[39]  Massimo Marinacci,et al.  Learning and self-confirming long-run biases , 2019, J. Econ. Theory.

[40]  John H. Kagel,et al.  Learning and transfer in signaling games , 2008 .

[41]  Sven Rady,et al.  Negatively Correlated Bandits , 2008 .

[42]  Joshua Mollner,et al.  Extended Proper Equilibrium , 2020, J. Econ. Theory.

[43]  David M. Kreps,et al.  Learning Mixed Equilibria , 1993 .

[44]  Pierpaolo Battigalli,et al.  Conjectural Equilibria and Rationalizability in a Game with Incomplete Information , 1997 .

[45]  Wei Chen,et al.  Combinatorial Multi-Armed Bandit: General Framework and Applications , 2013, ICML.

[46]  Ignacio Esponda,et al.  Berk-Nash Equilibrium: A Framework for Modeling Agents with Misspecified Models , 2014, arXiv:1411.1152.

[47]  D. Fudenberg,et al.  Bayesian posteriors for arbitrarily rare events , 2016, Proceedings of the National Academy of Sciences.

[48]  Zheng Wen,et al.  Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits , 2014, AISTATS.

[49]  W. R. Thompson.  On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .

[50]  David M. Kreps,et al.  Choosing a good toolkit, II: Bayes-rule based heuristics , 2020 .