Learning, Experimentation, and Information Design

INTRODUCTION

The purpose of this paper is to survey recent developments in a literature that combines ideas from experimentation, learning, and strategic interactions. Because this literature is multifaceted, let us start by circumscribing our overview. First and foremost, all surveyed papers involve nontrivial dynamics. Second, we will restrict attention to models that deal with uncertainty. Models of pure moral hazard, in particular, will not be covered. Third, we exclude papers that focus on monetary transfers. Our goal is to understand incentives via other channels – information in particular, but also delegation. Fourth, we focus on strategic and agency problems, and so leave out papers whose scope is decision-theoretic. However, rules are there to be broken, and we will briefly discuss some papers that deal with one-player problems, to the extent that they are closely related to the issues at hand. Finally, we restrict attention to papers that are relatively recent (specifically, we have chosen to start with Bolton and Harris, 1999).

Our survey is divided as follows. First, we start with models of strategic experimentation. These are abstract models with few direct economic applications, but they develop ideas and techniques that percolate through the literature. In these models, players are (usually) symmetric and externalities are (mostly) informational. Moving beyond the exploitation/exploration trade-off, we then turn to agency models that introduce a third dimension: motivation. Experimentation must be incentivized. The first way this can be done (Section 3) is via the information that is being disclosed to the agent performing the experimentation, by a principal who knows more or sees more. A second way this can be done is via control. The nascent literature on delegation in dynamic environments is the subject of Section 4.
Section 5 turns to models in which information disclosure is not simply about inducing experimentation, but about manipulating the agent's action in broader contexts. To abstract from experimentation altogether, we assume that the principal knows all there is to know, so that only the agent faces uncertainty. Finally, Section 6 discusses experimentation with more than two arms (Callander, 2011).

EQUILIBRIUM INTERACTIONS

Strategic Bandits

Strategic bandit models are game-theoretic versions of standard bandit models. While the standard “multi-armed bandit” describes a hypothetical experiment in which a player faces several slot machines (“one-armed bandits”) with potentially different expected payouts, a strategic bandit involves several players facing (usually identical) copies of the same slot machine.
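
To make the informational externality concrete, the following is a minimal simulation sketch and not a model taken from any of the surveyed papers. It assumes, purely for illustration, that the risky arm is either good or bad, that each pull pays off with probability `q_good` or `q_bad` accordingly, that all outcomes are publicly observed, and that players follow a hypothetical common cutoff rule (pull the risky arm while the shared belief is at least `cutoff`). The cutoff rule is a heuristic, not an equilibrium strategy, and all names and parameter values are assumptions.

```python
import random


def update_belief(p, success, q_good=0.6, q_bad=0.3):
    """Bayes update of the common belief p that the risky arm is good.

    q_good and q_bad are assumed success probabilities in each state.
    """
    like_good = q_good if success else 1 - q_good
    like_bad = q_bad if success else 1 - q_bad
    return p * like_good / (p * like_good + (1 - p) * like_bad)


def simulate(n_players=3, rounds=50, prior=0.5, cutoff=0.4,
             arm_is_good=True, q_good=0.6, q_bad=0.3, seed=0):
    """Return the path of the common belief under a hypothetical cutoff rule.

    Every player holds an identical copy of the same risky arm, and every
    outcome is public, so each pull moves the belief of all players.
    """
    rng = random.Random(seed)
    q_true = q_good if arm_is_good else q_bad
    p = prior
    history = [p]
    for _ in range(rounds):
        if p < cutoff:
            # Everyone moves to the safe arm, so learning stops for good.
            break
        for _ in range(n_players):
            success = rng.random() < q_true
            p = update_belief(p, success, q_good, q_bad)
        history.append(p)
    return history


if __name__ == "__main__":
    for n in (1, 3):
        path = simulate(n_players=n)
        print(n, "player(s):", len(path) - 1, "rounds, final belief",
              round(path[-1], 3))
```

Because beliefs are shared, each player's pull is a public good: more players means faster learning per round, which is the source of the free-riding incentives that this literature studies.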

[1]  Yishay Mansour,et al.  Implementing the “Wisdom of the Crowd” , 2013, Journal of Political Economy.

[2]  J. Sobel,et al.  Strategic Information Transmission , 1982 .

[3]  Jérôme Renault,et al.  Repeated Games with Incomplete Information , 2009, Encyclopedia of Complexity and Systems Science.

[4]  Johanna He Competition in Social Learning , 2017 .

[5]  Caroline D. Thomas Strategic Experimentation With Congestion , 2018, American Economic Journal: Microeconomics.

[6]  Philipp Strack,et al.  Strategic Experimentation with Private Payoffs , 2015, J. Econ. Theory.

[7]  Umberto Garfagnini,et al.  Social Experimentation with Interdependent and Expanding Technologies , 2015 .

[8]  Eilon Solan,et al.  Bandit Problems with Lévy Processes , 2013, Math. Oper. Res..

[9]  Nadya Malenko,et al.  Timing Decisions in Organizations: Communication and Authority in a Dynamic Environment , 2016 .

[10]  Bengt Holmstrom,et al.  On The Theory of Delegation , 1980 .

[11]  Nicolas Vieille,et al.  Social Learning in One-Arm Bandit Problems , 2007 .

[12]  Yeon-Koo Che,et al.  Optimal Design for Social Learning , 2015 .

[13]  Yingni Guo Dynamic Delegation of Experimentation , 2016 .

[14]  Emir Kamenica,et al.  Bayesian Persuasion , 2009 .

[15]  Nicolas Vieille,et al.  Optimal dynamic information provision , 2014, Games Econ. Behav..

[16]  Steven Callander,et al.  Managing on Rugged Landscapes , 2014 .

[17]  M. Cripps,et al.  Strategic Experimentation with Exponential Bandits , 2003 .

[18]  Godfrey Keller,et al.  Strategic Experimentation with Poisson Bandits , 2009 .

[19]  A. Bonatti,et al.  The Politics of Compromise , 2014 .

[20]  Juuso Välimäki,et al.  Learning and Information Aggregation in an Exit Game , 2011 .

[21]  Y. Ishii,et al.  Innovation Adoption by Forward-Looking Social Learners , 2015 .

[22]  Andrzej Skrzypacz,et al.  Persuading the Regulator to Wait , 2016 .

[23]  Strongly Symmetric Equilibria in Bandit Games , 2014 .

[24]  Nahum D. Melumad,et al.  Communication in settings with no transfers , 1991 .

[25]  Bruno H. Strulovici Learning While Voting: Determinants of Collective Experimentation , 2010 .

[26]  Sven Rady,et al.  Negatively Correlated Bandits , 2008 .

[27]  Florian Ederer,et al.  Delay and Deadlines: Freeriding and Information Revelation in Partnerships , 2013 .

[28]  Pauli Murto,et al.  Delay and information aggregation in stopping games with private information , 2013, J. Econ. Theory.

[29]  Daria Khromenkova Collective Experimentation with Breakdowns and Breakthroughs , 2015 .

[30]  Martin W. Cripps,et al.  Strategic experimentation in queues , 2019, Theoretical Economics.

[31]  Johannes Hörner,et al.  Collaborating , 2009 .

[32]  Kaustav Das The Role of Heterogeneity in a Model of Strategic Experimentation , 2015 .

[33]  Ufuk Akcigit,et al.  The Role of Information in Innovation and Competition , 2014 .

[34]  Johannes Hörner,et al.  Learning to Disagree in a Game of Experimentation , 2015, J. Econ. Theory.

[35]  Nicolas Vieille,et al.  On Games of Strategic Experimentation , 2013, Games Econ. Behav..

[36]  Kostas Bimpikis,et al.  Designing Dynamic Contests , 2015, EC.

[37]  Steven Callander,et al.  Searching and Learning by Trial and Error , 2011 .

[38]  Godfrey Keller,et al.  Breakdowns , 2012 .

[39]  S. Hart,et al.  Long Cheap Talk , 2003 .