Targeting Optimization for Internet Advertising by Learning from Logged Bandit Feedback

In the last two decades, online advertising has become the most effective way to sponsor a product or an event. The success of this advertising format is mainly due to the ability of Internet channels to reach a broad audience and to target different groups of users with specific sponsored announcements. This is of paramount importance for media agencies, companies whose primary goal is to design ad campaigns that target only those users who are interested in the sponsored product, thus avoiding the unnecessary cost of displaying ads to uninterested users. In the present work, we develop an automatic method to find the best user targets (a.k.a. contexts) that a media agency can use in a given Internet advertising campaign. More specifically, we formulate the problem of target optimization as a Learning from Logged Bandit Feedback (LLBF) problem, and we propose the TargOpt algorithm, which uses a tree expansion of the target space to learn the partition that efficiently maximizes the campaign revenue. Furthermore, since the problem of finding the optimal target is intrinsically exponential in the number of features, we propose a tree-search method, called A-TargOpt, and two heuristics to drive the tree expansion, with the aim of providing an anytime solution. Finally, we present empirical evidence, on both synthetically generated and real-world data, that our algorithms provide a practical solution for finding effective targets for Internet advertising.
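To make the idea of a tree expansion over the target space concrete, the following is a minimal illustrative sketch and not the TargOpt algorithm itself. It assumes binary user features, rewards in [0, 1], logs collected under a uniform logging policy (so no importance weighting is applied), and a Hoeffding lower confidence bound as the pessimistic value of each cell. All names (`lower_bound`, `cell_value`, `expand`, the features `young` and `mobile`) are hypothetical.

```python
import math
from collections import defaultdict


def lower_bound(rewards, delta=0.05):
    """Hoeffding lower confidence bound on the mean of rewards in [0, 1]."""
    n = len(rewards)
    if n == 0:
        return 0.0
    mean = sum(rewards) / n
    return mean - math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def cell_value(records, delta=0.05):
    """Return the best action of a cell and its pessimistic value (size * LCB)."""
    by_action = defaultdict(list)
    for _, action, reward in records:
        by_action[action].append(reward)
    best_action, best_lcb = max(
        ((a, lower_bound(r, delta)) for a, r in by_action.items()),
        key=lambda pair: pair[1],
    )
    return best_action, len(records) * best_lcb


def expand(records, features, delta=0.05):
    """Greedily split cells on binary features while the pessimistic value improves.

    Each record is (context_dict, action, reward). Returns a list of
    (constraints, action) leaves describing the learned partition.
    """
    def recurse(recs, remaining, constraints):
        action, value = cell_value(recs, delta)
        best = None
        for f in remaining:
            left = [r for r in recs if r[0][f] == 0]
            right = [r for r in recs if r[0][f] == 1]
            if not left or not right:
                continue  # this feature does not separate the cell
            split_value = cell_value(left, delta)[1] + cell_value(right, delta)[1]
            if split_value > value and (best is None or split_value > best[0]):
                best = (split_value, f, left, right)
        if best is None:
            return [(constraints, action)]
        _, f, left, right = best
        rest = [g for g in remaining if g != f]
        return (recurse(left, rest, {**constraints, f: 0})
                + recurse(right, rest, {**constraints, f: 1}))

    return recurse(records, list(features), {})


# Synthetic log: action "A" pays off only for young users, "B" only for the rest;
# the "mobile" feature is irrelevant by construction.
logs = []
for young in (0, 1):
    for action in ("A", "B"):
        for i in range(200):
            reward = 1.0 if (young == 1) == (action == "A") else 0.0
            logs.append(({"young": young, "mobile": i % 2}, action, reward))

leaves = expand(logs, ["young", "mobile"])
for constraints, action in leaves:
    print(constraints, "->", action)
```

In this sketch a split is accepted only when the summed pessimistic values of the two children exceed the parent's value, so splits on uninformative features are rejected because they shrink the sample per cell and widen the confidence bound. The exhaustive recursion also illustrates why the search grows exponentially with the number of features, which is the cost an anytime method such as A-TargOpt is meant to contain.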
