Efficient Dynamic Allocation with Uncertain Valuations

In this paper we consider the problem of efficiently allocating a given resource or object repeatedly over time. The agents, who may temporarily receive access to the resource, learn more about its value through its use. When the agents' beliefs about their valuations at any given time are public information, this problem reduces to the classic multi-armed bandit problem, which is solved by determining a Gittins index for every agent. In the setting we study, agents observe their valuations privately, so efficient dynamic resource allocation under asymmetric information becomes a problem of truthfully eliciting every agent's Gittins index. We introduce two bounding mechanisms: under one, agents announce types corresponding to Gittins indices at least as high as their true Gittins indices; under the other, at most as high. Using an announcement-contingent affine combination of the bounding mechanisms, it is possible to implement the efficient dynamic allocation policy. We provide necessary and sufficient conditions for global Bayesian incentive compatibility, guaranteeing a truthful efficient allocation of the resource. Using essentially the same method, it is possible to approximately implement truthful mechanisms corresponding to a large variety of surplus distribution objectives the principal might have, for instance a dynamic second-price Gittins index auction, which maximizes the principal's revenue subject to implementing an efficient allocation policy.
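
For readers unfamiliar with the index mentioned above, the following is the standard textbook definition of the Gittins index, given as a sketch rather than in the paper's own notation. The symbols are assumptions introduced here for illustration: $x_i$ is agent $i$'s current state (for example, a belief about the valuation), $r_i$ the per-period reward, $\beta \in (0,1)$ a common discount factor, and $\tau$ a stopping time.

```latex
% Standard Gittins index (notation assumed, not taken from the paper):
% the best achievable ratio of expected discounted reward to expected
% discounted time, over all stopping times \tau \ge 1.
\[
  G_i(x_i) \;=\; \sup_{\tau \ge 1}
  \frac{\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t}\, r_i(x_{i,t}) \,\middle|\, x_{i,0} = x_i\right]}
       {\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t} \,\middle|\, x_{i,0} = x_i\right]}
\]
% Index policy under public information: in each period, allocate the
% resource to an agent with the highest current index,
% i^{*} \in \arg\max_i G_i(x_i).
```

Under public information the index policy is optimal for the classic bandit problem; the elicitation problem described in the abstract arises because each $x_i$ is observed only by agent $i$.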
