Evaluating and Optimizing Online Advertising: Forget the Click, but There Are Good Proxies

Online systems promise to improve advertisement targeting via the massive and detailed data they make available. However, there is often too little data on the exact outcome of interest, such as purchases, for accurate campaign evaluation and optimization (due to low conversion rates, cold-start periods, lack of instrumentation of offline purchases, and long purchase cycles). This paper presents a detailed treatment of proxy modeling, which is based on identifying a suitable alternative (proxy) target variable when data on the true objective is in short supply (or even completely nonexistent). The paper's contribution is twofold. First, the potential of proxy modeling is demonstrated clearly, based on a massive-scale experiment across 58 real online advertising campaigns. Second, we assess the value of different specific proxies for evaluating and optimizing online display advertising, with striking results. The results include bad news and good news. The most commonly cited and used proxy is a click on an ad. The bad news is that across a large number of campaigns, clicks are not good proxies for evaluation or for optimization: clickers do not resemble buyers. The good news is that an alternative sort of proxy performs remarkably well: observed visits to the brand's website. Specifically, predictive models built from brand site visits, which are much more common than purchases, do a remarkably good job of predicting which browsers will make a purchase. The practical bottom line: evaluating and optimizing campaigns using clicks seems wrongheaded; there is, however, an easy and attractive alternative: use a well-chosen site-visit proxy instead.
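As a rough illustration of the proxy-modeling idea described above (a minimal sketch, not the authors' actual pipeline), the example below trains a classifier on an abundant proxy label (brand-site visits) and then measures how well its scores rank browsers on the scarce true label (purchases), using AUC. The synthetic data, feature construction, and choice of scikit-learn are assumptions made purely for illustration.

```python
# Minimal sketch of proxy modeling (illustrative only; data and features are synthetic/hypothetical).
# Idea: train on a plentiful proxy target (site visits), then check how well the resulting
# scores rank browsers on the rare true target (purchases), e.g. via AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical browser features (e.g., counts of content categories visited).
n_browsers, n_features = 100_000, 50
X = rng.poisson(1.0, size=(n_browsers, n_features)).astype(float)

# A latent propensity drives both labels; purchases are far rarer than site visits.
propensity = 0.05 * (X @ rng.normal(size=n_features))
visit = (rng.random(n_browsers) < 1 / (1 + np.exp(-(propensity - 3)))).astype(int)     # proxy label (common)
purchase = (rng.random(n_browsers) < 1 / (1 + np.exp(-(propensity - 6)))).astype(int)  # true label (rare)

# Train on the proxy target only ...
model = LogisticRegression(max_iter=1000).fit(X, visit)

# ... then evaluate how well the proxy-trained scores rank actual purchasers.
scores = model.predict_proba(X)[:, 1]
print("AUC on true purchase label:", roc_auc_score(purchase, scores))
```

In practice the evaluation would be done on held-out browsers and on logged (not simulated) purchase data; the point of the sketch is only the train-on-proxy, evaluate-on-true-outcome pattern.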
