Missing Click History in Sponsored Search: A Generative Modeling Solution

A fundamental problem in sponsored search advertising is estimating the probability of a click for ads displayed in response to search queries. The historical click-through rate (CTR) is one of the most important predictors of a click, and it is extracted at multiple resolutions of the query-ad hierarchy. However, new ads have no click history, and even existing ads may lack history at some resolutions because of, for example, tail queries. Besides reducing accuracy, missing features add significant complexity to the design of conditional click-probability models such as the maximum-entropy model. In this paper, we develop a generative modeling solution for handling missing features in the maximum-entropy model and other conditional models. In particular, a mixture of multivariate Gaussian distributions is used to learn a representation of the CTR features. This mixture model then supplies information about the missing features to the maximum-entropy model in increasing degrees of sophistication, ranging from pointwise estimates of the missing features, to multi-way interaction terms, to novel posterior features. We show the utility of this approach for click prediction in sponsored search using click-view data collected from the Yahoo! search engine. We find that the generative modeling approach not only improves click prediction accuracy over a state-of-the-art system, but also results in a significantly less complex system.
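The abstract does not give the estimation formulas, so the following is only a minimal sketch of how a Gaussian mixture can yield pointwise estimates of missing features and component posteriors. It uses the standard conditional-mean identity for Gaussians and assumes the mixture parameters have already been fitted (for example, with EM). The function name `gmm_impute`, its arguments, and the toy numbers are hypothetical and not taken from the paper.

```python
import numpy as np

def gmm_impute(x, weights, means, covs):
    """Fill NaN entries of x with their conditional mean under a Gaussian mixture.

    Hypothetical helper: returns (x_filled, post), where post[k] is the
    posterior probability of component k given the observed entries of x.
    Assumes at least one entry of x is observed.
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    miss = np.isnan(x)
    obs = ~miss
    n_comp = len(weights)
    log_post = np.empty(n_comp)
    cond_means = np.empty((n_comp, miss.sum()))

    for k in range(n_comp):
        mu_o, mu_m = means[k][obs], means[k][miss]
        s_oo = covs[k][np.ix_(obs, obs)]    # covariance of observed features
        s_mo = covs[k][np.ix_(miss, obs)]   # cross-covariance missing/observed
        diff = x[obs] - mu_o
        sol = np.linalg.solve(s_oo, diff)   # s_oo^{-1} (x_o - mu_o)
        _, logdet = np.linalg.slogdet(s_oo)
        # Unnormalized log posterior: log weight + Gaussian log density of x_o.
        log_post[k] = np.log(weights[k]) - 0.5 * (
            obs.sum() * np.log(2 * np.pi) + logdet + diff @ sol
        )
        # Per-component conditional mean E[x_m | x_o, k].
        cond_means[k] = mu_m + s_mo @ sol

    # Normalize in log space for numerical stability.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    x_filled = x.copy()
    x_filled[miss] = post @ cond_means      # posterior-weighted average
    return x_filled, post

# Toy usage: two CTR-like features, the second one missing.
toy_weights = [0.5, 0.5]
toy_means = [[0.0, 0.0], [10.0, 10.0]]
toy_covs = [np.eye(2), np.eye(2)]
filled, post = gmm_impute([10.0, np.nan], toy_weights, toy_means, toy_covs)
```

In this sketch, `filled` plays the role of a pointwise estimate of the missing features, while `post` is one plausible reading of what a posterior feature could look like when passed to a downstream conditional model; the paper's actual feature construction may differ.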
