TSI: An Ad Text Strength Indicator using Text-to-CTR and Semantic-Ad-Similarity

Coming up with effective ad text is a time-consuming process, and it is particularly challenging for small businesses with limited advertising experience. When an inexperienced advertiser onboards with a poorly written ad text, the ad platform has the opportunity to detect the low-performing ad text and provide improvement suggestions. To realize this opportunity, we propose an ad text strength indicator (TSI) which: (i) predicts the click-through rate (CTR) for an input ad text, (ii) fetches similar existing ads to create a neighborhood around the input ad, and (iii) compares the predicted CTRs in the neighborhood to declare whether the input ad is strong or weak. In addition, as suggestions for ad text improvement, TSI shows anonymized versions of superior ads (those with higher predicted CTR) in the neighborhood. For (i), we propose a BERT-based text-to-CTR model trained on impressions and clicks associated with an ad text. For (ii), we propose a sentence-BERT-based semantic-ad-similarity model trained using weak labels from ad campaign setup data. Offline experiments demonstrate that our BERT-based text-to-CTR model achieves a significant lift in CTR prediction AUC for cold-start (new) advertisers compared to bag-of-words baselines. In addition, our semantic-ad-similarity model for similar-ads retrieval achieves a precision@1 of 0.93 (for retrieving ads from the same product category); this is significantly higher than unsupervised TF-IDF, word2vec, and sentence-BERT baselines. Finally, we share promising online results from advertisers in the Yahoo (Verizon Media) ad platform, where a variant of TSI was implemented with sub-second end-to-end latency.
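The three-step pipeline above (predict CTR, retrieve a semantic neighborhood, compare within it) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embeddings and predicted CTRs are assumed to come from the text-to-CTR and semantic-ad-similarity models described above, the `tsi_verdict` function and its percentile threshold are hypothetical, and the retrieval is a brute-force cosine-similarity scan rather than a production nearest-neighbor index.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def tsi_verdict(input_emb, input_ctr, ad_index, k=5, pct=50.0):
    """Illustrative TSI decision (hypothetical helper, not the paper's code).

    input_emb  -- embedding of the input ad text (e.g., from a sentence encoder)
    input_ctr  -- predicted CTR of the input ad (e.g., from a text-to-CTR model)
    ad_index   -- list of (embedding, predicted_ctr, ad_text) for existing ads
    Returns ('strong' | 'weak', suggestions), where suggestions are neighbor
    ads with higher predicted CTR than the input ad.
    """
    # (ii) build a neighborhood: k most semantically similar existing ads
    ranked = sorted(ad_index,
                    key=lambda rec: cosine_sim(input_emb, rec[0]),
                    reverse=True)
    neighbors = ranked[:k]
    # (iii) compare the input ad's predicted CTR against the neighborhood;
    # here the threshold is the neighborhood's pct-th CTR percentile (assumed)
    neighbor_ctrs = [ctr for _, ctr, _ in neighbors]
    threshold = np.percentile(neighbor_ctrs, pct)
    verdict = "strong" if input_ctr >= threshold else "weak"
    # superior neighbors double as anonymizable improvement suggestions
    suggestions = [text for _, ctr, text in neighbors if ctr > input_ctr]
    return verdict, suggestions
```

In a real deployment the scan over `ad_index` would be replaced by an approximate nearest-neighbor index to meet the sub-second latency reported in the paper.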
