Spatio-temporal models for estimating click-through rate

We propose novel spatio-temporal models to estimate click-through rates (CTR) in the context of content recommendation. We track article CTR at a fixed location over time through a dynamic Gamma-Poisson model and combine information from correlated locations through dynamic linear regressions, which significantly improves on per-location models. Our models adjust for user fatigue through an exponential tilt to the first-view CTR (the probability of a click on the first exposure to an article) that is based only on user-specific repeat-exposure features. We illustrate our approach on data obtained from the Today Module, a module published regularly on the Yahoo! Front Page, and demonstrate significant improvement over commonly used baseline methods. Large-scale simulation experiments that study the performance of our models under different scenarios provide encouraging results. Throughout, all modeling assumptions are validated via rigorous exploratory data analysis.
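
To make the two core ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common form of a dynamic Gamma-Poisson tracker, in which the Gamma posterior over CTR is discounted by a fixed factor before each batch of clicks and views is absorbed, and it assumes the exponential tilt multiplies the first-view CTR by `exp(x . beta)`, where `x` holds repeat-exposure features. The class name, the function name, the prior values, and the discount factor are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class GammaPoissonCTR:
    """Discounted Gamma-Poisson tracker for one article at one location.

    CTR has a Gamma(alpha, gamma) posterior: alpha acts as pseudo-clicks,
    gamma as pseudo-views. All default values are illustrative assumptions.
    """
    alpha: float = 1.0
    gamma: float = 100.0
    discount: float = 0.95  # assumed down-weighting of older intervals

    def update(self, clicks: int, views: int) -> None:
        # Shrink past evidence, then add the new interval's counts.
        self.alpha = self.discount * self.alpha + clicks
        self.gamma = self.discount * self.gamma + views

    @property
    def mean(self) -> float:
        # Posterior mean of the CTR.
        return self.alpha / self.gamma

    @property
    def variance(self) -> float:
        # Posterior variance of the CTR.
        return self.alpha / self.gamma ** 2


def tilted_ctr(first_view_ctr: float,
               features: Sequence[float],
               weights: Sequence[float]) -> float:
    """Exponential tilt (assumed form): first-view CTR times exp(x . beta)."""
    score = sum(w * x for w, x in zip(weights, features))
    return first_view_ctr * math.exp(score)


if __name__ == "__main__":
    tracker = GammaPoissonCTR()
    # Hypothetical (clicks, views) counts per time interval.
    for clicks, views in [(12, 400), (9, 380), (4, 390)]:
        tracker.update(clicks, views)
    print("first-view CTR estimate:", tracker.mean)
    # One hypothetical feature: number of previous exposures without a click.
    print("CTR after 3 prior exposures:",
          tilted_ctr(tracker.mean, [3.0], [-0.2]))
```

With a negative weight on the exposure count, the tilted CTR falls as a user sees the same article repeatedly, which is the fatigue effect described above. A discount factor below 1 lets the estimate follow a CTR that drifts over time; a factor of exactly 1 recovers the static Gamma-Poisson update.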
