Strategic Experimentation: A Revision

This paper extends the classic two-armed bandit problem to a many-agent setting in which I players each face the same experimentation problem. The main change from the single-agent problem is that an agent can now learn from the current experimentation of other agents. Information is therefore a public good, and a free-rider problem in experimentation naturally arises. More interestingly, the prospect of future experimentation by others encourages agents to increase current experimentation, in order to bring forward the time at which the extra information generated by such experimentation becomes available. The paper provides an analysis of the set of stationary Markov equilibria in terms of the free-rider effect and the encouragement effect. The paper is a revision of our earlier paper, Bolton and Harris [7]. The main modification concerns the formulation of randomization in continuous time (cf. Harris [12]). The earlier paper explored one formulation based on the idea of rapid alternation over the state space. The current paper explores a formulation which is the closest analogue of the discrete-time formulation. It is based on the idea of randomization at each instant of time.

[1]  Douglas Gale,et al.  Information Revelation and Strategic Delay in a Model of Investment , 1994 .

[2]  B. Jullien,et al.  OPTIMAL LEARNING BY EXPERIMENTATION , 1991 .

[3]  S. Bikhchandani,et al.  A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades , 2007 .

[4]  Rafael Rob,et al.  Learning and Capacity Expansion under Demand Uncertainty , 1991 .

[5]  A. Banerjee,et al.  A Simple Model of Herd Behavior , 1992 .

[6]  Generalized Solutions of Stochastic Differential Games in One Dimension , 1993 .

[7]  Dan Kovenock,et al.  Asymmetric Information, Information Externalities, and Efficiency: The Case of Oil Exploration , 1989 .

[8]  M. Freidlin Functional Integration And Partial Differential Equations , 1985 .

[9]  Frank Spitzer Optimal Stopping Rules (A. N. Shiryayev) , 1981 .

[10]  Glenn Ellison,et al.  Rules of Thumb for Social Learning , 1993, Journal of Political Economy.

[11]  M. Rothschild A two-armed bandit theory of market pricing , 1974 .

[12]  Larry Samuelson,et al.  Sequential Research and the Adoption of Innovations , 1986 .

[13]  Alʹbert Nikolaevich Shiri︠a︡ev,et al.  Statistics of random processes , 1977 .

[14]  A. McLennan Price dispersion and incomplete learning in the long run , 1984 .

[15]  N. Kiefer,et al.  Controlling a Stochastic Process with Unknown Parameters , 1988 .

[16]  Leonard J. Mirman,et al.  Duopoly signal jamming , 1993 .

[17]  I. Karatzas Gittins Indices in the Dynamic Allocation Problem for Diffusion Processes , 1984 .

[18]  B. Jullien,et al.  Dynamic duopoly with learning through market experimentation , 1993 .