Where to Sell: Simulating Auctions From Learning Algorithms

Ad exchange platforms connect online publishers and advertisers and facilitate the sale of billions of impressions every day. We study these environments from the perspective of a publisher who wants to find the profit-maximizing exchange in which to sell his inventory. Ideally, the publisher would run an auction among exchanges. However, this is not usually possible due to practical business considerations. Instead, the publisher must send each impression to only one of the exchanges, along with an asking price. We model the problem as a variation of the multi-armed bandit problem in which exchanges (arms) can behave strategically in order to maximize their own profit. We propose mechanisms that find the best exchange with sub-linear regret and have desirable incentive properties.
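
To make the bandit framing concrete, the following is a minimal sketch of the non-strategic baseline only: the standard UCB1 index policy choosing among exchanges whose per-impression revenue is assumed to be an independent Bernoulli draw with a fixed mean. It is not the mechanism proposed here, since it ignores both the asking price and strategic behavior by exchanges; the function name, the parameter names, and the revenue model are assumptions made for illustration.

```python
import math
import random


def ucb1_select_exchange(mean_revenues, rounds, seed=0):
    """Run UCB1 over hypothetical, non-strategic exchanges.

    mean_revenues: assumed expected revenue per impression, in [0, 1],
                   for each exchange (unknown to the learner).
    rounds:        number of impressions to sell, one per round.
    Returns (pull counts per exchange, total revenue collected).
    """
    rng = random.Random(seed)
    k = len(mean_revenues)
    counts = [0] * k      # impressions sent to each exchange
    totals = [0.0] * k    # revenue observed from each exchange
    revenue = 0.0

    for t in range(1, rounds + 1):
        if t <= k:
            # Send one impression to every exchange first.
            arm = t - 1
        else:
            # Pick the exchange with the highest upper confidence bound.
            arm = max(
                range(k),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Assumed revenue model: Bernoulli draw with the exchange's mean.
        reward = 1.0 if rng.random() < mean_revenues[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
        revenue += reward

    return counts, revenue


if __name__ == "__main__":
    counts, revenue = ucb1_select_exchange([0.2, 0.5, 0.8], rounds=5000)
    print("impressions per exchange:", counts)
    print("total revenue:", revenue)
```

In this baseline the number of impressions sent to suboptimal exchanges grows only logarithmically in the number of rounds, which is the sense in which regret is sub-linear. When exchanges can misreport or shade their behavior to influence the learner, this guarantee no longer applies directly, which is the gap the strategic model addresses.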
