Efficient Metadeliberation Auctions

Imagine a resource allocation scenario in which the interested parties can, at a cost, individually research ways of using the resource to be allocated, potentially increasing the value they would achieve from obtaining it. Each agent has a private model of its research process and obtains a private realization of its improvement in value, if any. From a social perspective it is optimal to coordinate research in a way that strikes the right tradeoff between value and cost, ultimately allocating the resource to one party; thus this is a problem of multi-agent metadeliberation. We provide a reduction of computing the optimal deliberation-allocation policy to computing Gittins indices in multi-armed bandit worlds, and apply a modification of the dynamic-VCG mechanism to yield truthful participation in an ex post equilibrium. Our mechanism achieves equilibrium implementation of the optimal policy even when agents have the capacity to deliberate about other agents' valuations, and thus addresses the problem of strategic deliberation.
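
To make the reference to Gittins indices concrete, the following is a minimal sketch of how such an index can be computed for one arm modeled as a finite-state Markov reward process with discount factor `beta`. It uses the standard restart-in-state characterization of the index, solved by value iteration. It is not the construction used in the paper: the abstract does not say how agents' research processes are mapped to arms, so the function names (`gittins_index`, `pick_arm`), the transition matrix `P`, the reward vector `r`, and the toy two-state chain are all assumptions made purely for illustration.

```python
def gittins_index(P, r, beta, state, tol=1e-10, max_iter=100_000):
    """Gittins index of `state` for a finite Markov reward chain.

    P[j][k] is the transition probability from j to k, r[j] the reward
    collected in state j, and 0 < beta < 1 the discount factor.
    Uses the restart-in-state formulation: in every state the controller
    may either continue or act as if the chain were back in `state`.
    The index equals (1 - beta) times the optimal value at `state`.
    """
    n = len(r)
    V = [0.0] * n
    for _ in range(max_iter):
        # Value of continuing from each state under the current estimate.
        cont = [r[j] + beta * sum(P[j][k] * V[k] for k in range(n))
                for j in range(n)]
        # Restarting is equivalent to continuing from `state`.
        restart = cont[state]
        new_V = [max(c, restart) for c in cont]
        diff = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if diff < tol:
            break
    return (1.0 - beta) * V[state]


def pick_arm(arms, beta):
    """Index policy: return the position of the arm with the largest index.

    `arms` is a list of (P, r, current_state) tuples (hypothetical layout).
    """
    indices = [gittins_index(P, r, beta, s) for (P, r, s) in arms]
    return max(range(len(arms)), key=lambda i: indices[i])


# Toy chain (illustrative only): state 0 pays nothing and moves to
# state 1, which is absorbing and pays 1 per step.
P = [[0.0, 1.0],
     [0.0, 1.0]]
r = [0.0, 1.0]
idx0 = gittins_index(P, r, beta=0.9, state=0)
idx1 = gittins_index(P, r, beta=0.9, state=1)
```

In this toy chain the index of state 1 is its per-step reward, and the index of state 0 is close to `beta`, since the best stopping rule from state 0 is never to stop. An index policy such as `pick_arm` then activates whichever arm currently has the highest index, which is the sense in which per-arm index computation yields a joint policy.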
