Importance of Recommendation Policy Space in Addressing Click Sparsity in Personalized Advertisement Display

We study a specific case of the personalized advertisement recommendation (PAR) problem, in which a user visits a system (website) and the system displays one of K ads to the user. The system uses an internal ad recommendation policy to map the user’s profile (context) to one of the ads. The user either clicks or ignores the ad, and the system updates its recommendation policy accordingly. Policy spaces in large-scale PAR systems are generally based on classifiers. A practical problem in PAR is extreme click sparsity: very few users actually click on ads. We systematically study the drawbacks of classifier-based policies in the face of extreme click sparsity. We then suggest an alternative policy, based on rankers learnt by optimizing the Area Under the Curve (AUC) ranking loss, which can significantly alleviate the problem of click sparsity. We create deterministic and stochastic policy spaces and conduct extensive experiments on public and proprietary datasets to illustrate the improvement in click-through rate (CTR) obtained by using the ranker-based policy over the classifier-based policy.
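
To make the AUC ranking loss concrete, the following is a minimal sketch and not the implementation used in the experiments. It computes the empirical AUC of a set of scores against click labels, along with a pairwise hinge surrogate of the kind commonly minimized in place of the non-differentiable AUC. The function names, the hinge surrogate, the margin of 1.0, and the argmax rule used for the deterministic policy are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def _split_by_click(scores: Sequence[float],
                    clicks: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Split scores into clicked (positive) and ignored (negative) groups."""
    pos = [s for s, c in zip(scores, clicks) if c == 1]
    neg = [s for s, c in zip(scores, clicks) if c == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one click and one non-click")
    return pos, neg


def empirical_auc(scores: Sequence[float], clicks: Sequence[int]) -> float:
    """Fraction of (click, non-click) pairs ranked correctly; ties count 0.5."""
    pos, neg = _split_by_click(scores, clicks)
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_hinge_loss(scores: Sequence[float], clicks: Sequence[int],
                        margin: float = 1.0) -> float:
    """Convex surrogate for 1 - AUC (hypothetical choice of surrogate)."""
    pos, neg = _split_by_click(scores, clicks)
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def deterministic_policy(ad_scores: Sequence[float]) -> int:
    """Assumed deterministic policy: show the ad with the highest score."""
    return max(range(len(ad_scores)), key=lambda k: ad_scores[k])


if __name__ == "__main__":
    clicks = [1, 0, 0, 0]                # sparse clicks: one click in four
    ranked = [0.9, 0.2, 0.4, 0.1]        # scores that rank the click first
    constant = [0.0, 0.0, 0.0, 0.0]      # a model that ignores the context
    print(empirical_auc(ranked, clicks))
    print(empirical_auc(constant, clicks))
    print(pairwise_hinge_loss(ranked, clicks))
    print(deterministic_policy([0.1, 0.7, 0.3]))
```

The constant scorer illustrates why sparsity is a problem for accuracy-driven classifiers: predicting "no click" for every impression is right three times out of four in this toy data, yet its AUC is only 0.5 because it cannot order a click above a non-click.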
