Efficient Behavior Targeting Using SVM Ensemble Indexing

Behavior targeting (BT) is a promising tool for online advertising. State-of-the-art BT methods, which are mainly based on regression models, have two limitations. First, regression models are difficult to learn for behavior targeting because user clicks are typically several orders of magnitude fewer than views. Second, user interests are not fixed; they are often transient and influenced by media and pop culture. In this paper, we formulate behavior targeting as a classification problem and use an SVM ensemble for behavior prediction. The main challenge in using an SVM ensemble for BT is its computational cost: in our experiments, predicting behavior for 32 million users takes 53 minutes, which is too slow for online applications. To address this, we propose a fast SVM ensemble prediction framework that builds an indexing structure over the SVM ensemble to achieve sub-linear prediction time complexity. Experimental results on real-world, large-scale behavior targeting data demonstrate that the proposed method is efficient and outperforms existing BT models based on linear regression.
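
To make the idea of indexing an ensemble concrete, the following is a minimal sketch, not the indexing structure proposed in the paper, which the abstract does not specify. It assumes linear SVMs with sparse weight vectors and sparse user feature vectors, and uses a plain inverted index from feature to (model, weight) pairs so that prediction only touches weights for features the user actually has. All names (`build_index`, `predict`, the feature strings, and the toy weights) are hypothetical, and this simple scheme still visits every model once when counting votes, so it does not by itself achieve sub-linear time.

```python
from collections import defaultdict


def build_index(models):
    """Build an inverted index: feature -> list of (model_id, weight).

    `models` is a list of (weights, bias) pairs, where `weights` maps a
    feature name to its linear SVM weight (illustrative assumption).
    """
    index = defaultdict(list)
    for model_id, (weights, _bias) in enumerate(models):
        for feature, weight in weights.items():
            if weight != 0.0:
                index[feature].append((model_id, weight))
    return index


def predict(index, biases, user):
    """Majority vote of the ensemble for one sparse user vector.

    Only weights attached to the user's non-zero features are read.
    Returns +1 (predicted click) or -1; ties resolve to -1.
    """
    scores = list(biases)  # start each model's score at its bias
    for feature, value in user.items():
        for model_id, weight in index.get(feature, ()):
            scores[model_id] += weight * value
    votes = sum(1 if score > 0 else -1 for score in scores)
    return 1 if votes > 0 else -1


# Toy ensemble of three linear SVMs (made-up weights, for illustration only).
models = [
    ({"sports": 1.0, "cars": -0.5}, -0.2),
    ({"sports": 0.8}, -0.1),
    ({"travel": 1.0}, -0.5),
]
index = build_index(models)
biases = [bias for _weights, bias in models]

sports_fan = predict(index, biases, {"sports": 1.0})
car_browser = predict(index, biases, {"cars": 1.0})
```

In this toy setup the index is built once and reused for every user, which is where the savings over scoring each model separately would come from when feature vectors are sparse.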
