Social Learning and the Innkeeper's Challenge

Technological evolution, so central to human progress in recent decades, is the process of constantly introducing new technologies to replace old ones. A new technology is not necessarily a better one, and so should not always be embraced. How can society learn which novelties are actual improvements over the existing technology? Whereas the quality of the status-quo technology is well known, the new one is a pig in a poke. With sufficiently many individuals willing to explore the new technology, society can learn whether it is indeed an improvement. However, self-motivated agents often do not agree to explore. This is true, in particular, if agents have observed predecessors who were disappointed by the new technology. Inspired by the classical multi-armed bandit model, we study a setting where agents arrive sequentially and each must pull one of two arms in order to receive a reward: a risky arm (representing the new technology) or a safe arm (representing the existing one). A central planner must induce sufficiently many agents to experiment with the risky arm. The central planner observes the actions and rewards of all agents, while the agents themselves have only partial observation. For the setting where each agent observes his predecessor, we provide the central planner with a recommendation algorithm that is (almost) incentive compatible and facilitates social learning.
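To make the setting concrete, the following is a minimal Python sketch of the environment only, with no planner intervention. It is not the recommendation algorithm of the paper. Every modelling detail here is an assumption introduced for illustration: the risky arm is taken to be either "good" or "bad" with Bernoulli rewards, the safe arm pays a known constant, and each agent is a naive Bayesian who updates a common prior using only the predecessor's reward (ignoring what the predecessor's action itself might reveal). The names `posterior_good` and `simulate` and all parameter values are hypothetical.

```python
import random


def posterior_good(prior, reward, q_good, q_bad):
    """Bayes update of P(risky arm is good) after one Bernoulli reward."""
    like_good = q_good if reward == 1 else 1 - q_good
    like_bad = q_bad if reward == 1 else 1 - q_bad
    num = prior * like_good
    return num / (num + (1 - prior) * like_bad)


def simulate(n_agents, prior=0.6, q_good=0.8, q_bad=0.2,
             safe=0.5, is_good=True, seed=0):
    """Agents arrive one by one; each sees only the predecessor's outcome.

    Returns the full history of (action, reward) pairs, which is what the
    central planner (but no individual agent) would observe.
    """
    rng = random.Random(seed)
    q = q_good if is_good else q_bad  # true success rate of the risky arm
    history = []
    prev = None
    for _ in range(n_agents):
        belief = prior
        # Assumed naive rule: update only on a predecessor's risky-arm reward.
        if prev is not None and prev[0] == "risky":
            belief = posterior_good(prior, prev[1], q_good, q_bad)
        expected_risky = belief * q_good + (1 - belief) * q_bad
        action = "risky" if expected_risky >= safe else "safe"
        if action == "risky":
            reward = 1 if rng.random() < q else 0
        else:
            reward = safe
        prev = (action, reward)
        history.append(prev)
    return history


if __name__ == "__main__":
    for step, (action, reward) in enumerate(simulate(10), start=1):
        print(step, action, reward)
```

Under these assumed parameters, an agent whose predecessor pulled the risky arm and received reward 0 always chooses the safe arm, which illustrates why a disappointed predecessor discourages exploration even when the risky arm is in fact the better one.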
