Selling to a No-Regret Buyer

We consider the problem of a single seller repeatedly selling a single item to a single buyer (specifically, the buyer has a value drawn fresh from a known distribution D in every round). Prior work assumes that the buyer is fully rational and will perfectly reason about how their bids today affect the seller's decisions tomorrow. In this work we initiate a different direction: the buyer simply runs a no-regret learning algorithm over possible bids. We provide a fairly complete characterization of optimal auctions for the seller in this domain. Specifically:

- If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), then the seller can extract expected revenue arbitrarily close to the expected welfare. This auction is independent of the buyer's value distribution D, but somewhat unnatural, as it is sometimes in the buyer's interest to overbid.
- There exists a learning algorithm A such that if the buyer bids according to A, then the optimal strategy for the seller is simply to post the Myerson reserve for D every round.
- If the buyer bids according to EXP3 (or any "mean-based" learning algorithm), but the seller is restricted to "natural" auction formats where overbidding is dominated (e.g. Generalized First-Price or Generalized Second-Price), then the optimal strategy for the seller is a pay-your-bid format with decreasing reserves over time. Moreover, the seller's optimal achievable revenue is characterized by a linear program, and can be unboundedly better than the best truthful auction yet simultaneously unboundedly worse than the expected welfare.
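The two benchmarks the results are measured against, the revenue from posting the Myerson reserve and the expected welfare, can be made concrete with a small sketch. It assumes a discrete value distribution given as a dictionary, and relies on the standard fact that for a single buyer Myerson's optimal auction is a posted price maximizing p · Pr[v ≥ p]. The function names `monopoly_price` and `expected_welfare` and the example distribution are illustrative and do not come from the paper.

```python
from fractions import Fraction

def monopoly_price(dist):
    """Return (price, revenue) maximizing p * Pr[v >= p].

    dist maps each value to its probability (hypothetical input format).
    For a discrete distribution it suffices to try prices in the support.
    """
    best_price, best_rev = None, Fraction(-1)
    for p in sorted(dist):
        sale_prob = sum(q for v, q in dist.items() if v >= p)
        rev = p * sale_prob
        if rev > best_rev:  # ties go to the lower price
            best_price, best_rev = p, rev
    return best_price, best_rev

def expected_welfare(dist):
    """Expected value of the buyer: an upper bound on per-round revenue."""
    return sum(v * q for v, q in dist.items())

# Illustrative distribution: value 1 or 3, each with probability 1/2.
D = {1: Fraction(1, 2), 3: Fraction(1, 2)}
price, revenue = monopoly_price(D)
welfare = expected_welfare(D)
print(price, revenue, welfare)
```

In this example, posting the reserve earns 3/2 per round while the expected welfare is 2; the gap between these two quantities is the range within which the seller's revenue falls in the results listed above.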
