Stochastic bandits for multi-platform budget optimization in online advertising

We study the problem of an online advertising system that wants to optimally spend an advertiser's given campaign budget across multiple platforms, without knowing the value of showing an ad to the users on those platforms. We model this challenging practical application as a Stochastic Bandits with Knapsacks problem over T rounds of bidding, with the set of arms given by the set of distinct bidding m-tuples, where m is the number of platforms. First, we modify the algorithm proposed in Badanidiyuru et al. [11] to extend it to the case of multiple platforms, obtaining algorithms for both discrete and continuous bid spaces. Namely, for discrete bid spaces we give an algorithm with regret , where OPT is the performance of the optimal algorithm that knows the distributions. For continuous bid spaces the regret of our algorithm is . When restricted to this special case, this bound improves over Sankararaman and Slivkins [34] in the regime OPT ≪ T, as is the case in the particular application at hand. Second, we show an lower bound for the discrete case and an Ω(m^{1/3} B^{2/3}) lower bound for the continuous setting, almost matching the upper bounds. Finally, we use a real-world data set from a large online advertising company with multiple ad platforms and show that our algorithms outperform common benchmarks and satisfy the properties required by the real-world application.
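
To make the arm set concrete, the following minimal sketch enumerates the bidding m-tuples for a discrete bid grid and plays them under a budget. It is an illustration only, not the algorithm of the paper: the names `make_arms` and `run_uniform_baseline` are hypothetical, the bid grid is shared across platforms for simplicity, and the toy environment (a bid b wins with probability b, costs b when it wins, and yields a Bernoulli(0.5) reward) is an assumption chosen so the example runs on its own.

```python
import itertools
import random


def make_arms(bid_grid, m):
    """Return all bidding m-tuples: one bid per platform from a shared grid."""
    return list(itertools.product(bid_grid, repeat=m))


def run_uniform_baseline(arms, budget, T, seed=0):
    """Play uniformly random arms until T rounds pass or the budget runs out.

    Hypothetical environment (an assumption, not taken from the paper):
    on each platform a bid b wins with probability b, costs b when it
    wins, and a win yields a Bernoulli(0.5) reward.
    """
    rng = random.Random(seed)
    spent, reward = 0.0, 0.0
    for _ in range(T):
        arm = rng.choice(arms)
        cost, gain = 0.0, 0.0
        for b in arm:
            if rng.random() < b:  # the bid wins on this platform
                cost += b
                gain += 1.0 if rng.random() < 0.5 else 0.0
        if spent + cost > budget:  # knapsack constraint: stop before overspending
            break
        spent += cost
        reward += gain
    return reward, spent


arms = make_arms([0.0, 0.5, 1.0], m=2)
reward, spent = run_uniform_baseline(arms, budget=10.0, T=1000)
print(len(arms), reward, spent)
```

With n bid values per platform the arm set has n^m elements (here 3^2 = 9), which is why the dependence of the regret on m matters in this setting.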

[1] Yossi Azar, et al. On Revenue Maximization in Second-Price Ad Auctions, 2009, ESA.

[2] Archie C. Chapman, et al. Knapsack Based Optimal Policies for Budget-Limited Multi-Armed Bandits, 2012, AAAI.

[3] Nicole Immorlica, et al. Dynamics of bid optimization in online advertisement auctions, 2007, WWW '07.

[4] Donglin Zeng, et al. Estimating marginal survival function by adjusting for dependent censoring using many covariates, 2004, math/0409180.

[5] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.

[6] E. Kaplan, et al. Nonparametric Estimation from Incomplete Observations, 1958.

[7] Nikhil R. Devanur, et al. Linear Contextual Bandits with Knapsacks, 2015, NIPS.

[8] Nikhil R. Devanur, et al. Bandits with concave rewards and convex knapsacks, 2014, EC.

[9] Zhe Feng, et al. Online Learning for Measuring Incentive Compatibility in Ad Auctions, 2019, WWW.

[10] Christian Kroer, et al. Contextual First-Price Auctions with Budgets, 2021, ArXiv.

[11] Vijay Kamble. Revenue Management on an On-Demand Service Platform, 2018, WINE.

[12] Claire Mathieu, et al. Greedy bidding strategies for keyword auctions, 2007, EC '07.

[13] Jon Feldman, et al. Budget optimization in search-based advertising auctions, 2006, EC '07.

[14] Vincent Conitzer, et al. Pacing Equilibrium in First-Price Auction Markets, 2018, EC.

[15] Nicholas R. Jennings, et al. Efficient Regret Bounds for Online Bid Optimisation in Budget-Limited Sponsored Search Auctions, 2014, UAI.

[16] Aleksandrs Slivkins, et al. Combinatorial Semi-Bandits with Knapsacks, 2017, AISTATS.

[17] Nikhil R. Devanur, et al. An efficient algorithm for contextual bandits with knapsacks, and an extension to concave objectives, 2015, COLT.

[18] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.

[19] Aleksandrs Slivkins, et al. Introduction to Multi-Armed Bandits, 2019, Found. Trends Mach. Learn.

[20] Nicole Immorlica, et al. Adversarial Bandits with Knapsacks, 2018, 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS).

[21] Riccardo Colini-Baldeschi, et al. Equilibria in Auctions with Ad Types, 2021, WWW.

[22] John Langford, et al. Resourceful Contextual Bandits, 2014, COLT.

[23] David P. Williamson, et al. An adaptive algorithm for selecting profitable keywords for search-based advertising services, 2006, EC '06.

[24] Aleksandrs Slivkins, et al. Bandits with Knapsacks, 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[25] Vasilis Syrgkanis, et al. Learning to Bid Without Knowing your Value, 2017, EC.

[26] Robert D. Kleinberg, et al. Regret bounds for sleeping experts and bandits, 2010, Machine Learning.

[27] Kartik Hosanagar, et al. Optimal bidding in stochastic budget constrained slot auctions, 2008, EC '08.

[28] Vahab S. Mirrokni, et al. Budget Management Strategies in Repeated Auctions, 2016, WWW.

[29] Sujin Kim, et al. The stochastic root-finding problem: Overview, solutions, and open questions, 2011, TOMC.

[30] Nicola Gatti, et al. Online Joint Bid/Daily Budget Optimization of Internet Advertising Campaigns, 2020, Artif. Intell.

[31] Patrick Jaillet, et al. Real-Time Bidding with Side Information, 2017, NIPS.

[32] Christopher A. Wilkens, et al. The Ad Types Problem, 2019, WINE.

[33] Vashist Avadhanula, et al. On the tightness of an LP relaxation for rational optimization and its applications, 2016, Oper. Res. Lett.

[34] Ashish Goel, et al. Advertisement allocation for generalized second-pricing schemes, 2010, Oper. Res. Lett.

[35] Anton Schwaighofer, et al. Budget Optimization for Sponsored Search: Censored Learning in MDPs, 2012, UAI.

[36] Amin Saberi, et al. AdWords and generalized online matching, 2007.

[37] Aranyak Mehta, et al. Optimizing budget constrained spend in search advertising, 2013, WSDM '13.

[38] S. Sathiya Keerthi, et al. Ad Delivery with Budgeted Advertisers: A Comprehensive LP Approach, 2008.

[39] Branislav Kveton, et al. Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits, 2015.