A truthful learning mechanism for contextual multi-slot sponsored search auctions with externalities

Sponsored search auctions constitute one of the most successful applications of microeconomic mechanisms. In mechanism design, auctions are usually designed to incentivize advertisers to bid their truthful valuations and, at the same time, to assure both the advertisers and the auctioneer a non-negative utility. In sponsored search auctions, however, the click-through rates (CTRs) of the advertisers are often unknown to the auctioneer, so standard incentive-compatible mechanisms cannot be applied directly and must be paired with an effective learning algorithm that estimates the CTRs. This introduces the critical problem of designing a learning mechanism that estimates the CTRs while implementing a truthful mechanism, with a revenue loss as small as possible compared to an optimal mechanism designed with the true CTRs. Previous works showed that in single-slot auctions the problem can be solved using a suitable exploration-exploitation mechanism that achieves a per-step regret of order O(T^{-1/3}), where T is the number of times the auction is repeated. In this paper we extend these results to the general case of contextual multi-slot auctions with position- and ad-dependent externalities. In particular, we prove novel upper bounds on the revenue loss w.r.t. a VCG auction, and we report numerical simulations investigating how accurately these bounds predict the dependency of the regret on the number of rounds T, the number of slots K, and the number of advertisements n.
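
To make the single-slot exploration-exploitation idea concrete, the following is a minimal sketch, not the mechanism analyzed in the paper. It assumes an exploration phase of roughly T^{2/3} rounds (the length usually associated with an O(T^{-1/3}) per-step regret), round-robin allocation at no charge during exploration, and a CTR-weighted second-price rule during exploitation. The function name `explore_then_exploit` and all parameters are hypothetical.

```python
import math
import random


def explore_then_exploit(bids, true_ctrs, T, seed=0):
    """Illustrative single-slot explore-then-exploit auction (assumed design)."""
    n = len(bids)
    if n < 2 or T < n:
        raise ValueError("need at least two advertisers and T >= n rounds")
    rng = random.Random(seed)

    # Assumed exploration length: about T^(2/3), and at least one show per ad.
    tau = min(T, max(n, math.ceil(T ** (2 / 3))))

    # Exploration: ads are shown round-robin, independently of the bids,
    # and nobody is charged, so bids cannot influence the CTR estimates.
    shows = [0] * n
    clicks = [0] * n
    for t in range(tau):
        i = t % n
        shows[i] += 1
        clicks[i] += rng.random() < true_ctrs[i]
    est = [clicks[i] / shows[i] for i in range(n)]

    # Exploitation: rank by estimated CTR times bid; the winner pays, per click,
    # the smallest bid that would still have won under the estimated CTRs.
    scores = [est[i] * bids[i] for i in range(n)]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    winner, runner_up = order[0], order[1]
    price = scores[runner_up] / est[winner] if est[winner] > 0 else 0.0

    revenue = 0.0
    for _ in range(tau, T):
        if rng.random() < true_ctrs[winner]:
            revenue += price
    return tau, winner, price, revenue
```

In this sketch the revenue forgone during the tau free rounds and the error of the CTR estimates pull in opposite directions as tau grows, which is the trade-off that the regret bounds quantify.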
