Catching the drift: learning broad matches from clickthrough data

Identifying similar keywords, known as broad matches, is an important task in online advertising that has become a standard feature on all major keyword advertising platforms. Effective broad matching leads to improvements in both relevance and monetization, while increasing advertisers' reach and making campaign management easier. In this paper, we present a learning-based approach to broad matching that is based on exploiting implicit feedback in the form of advertisement clickthrough logs. Our method can utilize arbitrary similarity functions by incorporating them as features. We present an online learning algorithm, Amnesiac Averaged Perceptron, that is highly efficient yet able to quickly adjust to the rapidly-changing distributions of bidded keywords, advertisements and user behavior. Experimental results obtained from (1) historical logs and (2) live trials on a large-scale advertising platform demonstrate the effectiveness of the proposed algorithm and the overall success of our approach in identifying high-quality broad match mappings.

[1]  P. Chatterjee,et al.  Modeling the Clickstream: Implications for Web-Based Advertising Efforts , 2003 .

[2]  Christopher Meek,et al.  Improving Similarity Measures for Short Segments of Text , 2007, AAAI.

[3]  Panagiotis G. Ipeirotis,et al.  Get another label? improving data quality and data mining using multiple, noisy labelers , 2008, KDD.

[4]  Raymond J. Mooney,et al.  Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.

[5]  Benjamin Rey,et al.  Generating query substitutions , 2006, WWW '06.

[6]  Daniel C. Fain,et al.  Predicting Click-Through Rate Using Keyword Clusters , 2006 .

[7]  Gwilym M. Jenkins,et al.  Time series analysis, forecasting and control , 1971 .

[8]  Dan Gusfield,et al.  Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology , 1997 .

[9]  Ron Kohavi,et al.  Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[10]  Yoav Freund,et al.  Large Margin Classification Using the Perceptron Algorithm , 1998, COLT.

[11]  Eugene Agichtein,et al.  Exploring mouse movements for inferring query intent , 2008, SIGIR '08.

[12]  Ariel Fuxman,et al.  Using the wisdom of the crowds for keyword generation , 2008, WWW.

[13]  Kenneth Ward Church,et al.  Word Association Norms, Mutual Information, and Lexicography , 1989, ACL.

[14]  Ben Carterette,et al.  Evaluating Search Engines by Modeling the Relationship Between Relevance and Clicks , 2007, NIPS.

[15]  Filip Radlinski,et al.  Optimizing relevance and revenue in ad search: a query substitution approach , 2008, SIGIR '08.

[16]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[17]  Anuradha Bhamidipaty,et al.  Interactive deduplication using active learning , 2002, KDD.

[18]  Matthew Richardson,et al.  Predicting clicks: estimating the click-through rate for new ads , 2007, WWW '07.

[19]  Thorsten Joachims,et al.  Accurately interpreting clickthrough data as implicit feedback , 2005, SIGIR '05.

[20]  François Fouss,et al.  Random-Walk Computation of Similarities between Nodes of a Graph with Application to Collaborative Recommendation , 2007, IEEE Transactions on Knowledge and Data Engineering.

[21]  Vassilis Plachouras,et al.  Online learning from click data for sponsored search , 2008, WWW.

[22]  Razvan C. Bunescu,et al.  Using Encyclopedic Knowledge for Named entity Disambiguation , 2006, EACL.

[23]  Ellen M. Voorhees Variations in relevance judgments and the measurement of retrieval effectiveness , 2000, Inf. Process. Manag..

[24]  Enhong Chen,et al.  Context-aware query suggestion by mining click-through and session data , 2008, KDD.

[25]  Deepayan Chakrabarti,et al.  Contextual advertising by combining relevance with click feedback , 2008, WWW.

[26]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[27]  Mehran Sahami,et al.  A web-based kernel function for measuring the similarity of short text snippets , 2006, WWW '06.

[28]  Shenghuo Zhu,et al.  Learning multiple graphs for document recommendations , 2008, WWW.

[29]  Craig A. Knoblock,et al.  Learning domain-independent string transformation weights for high accuracy object identification , 2002, KDD.

[30]  Filip Radlinski,et al.  Search Engines that Learn from Implicit Feedback , 2007, Computer.

[31]  Susan T. Dumais,et al.  Similarity Measures for Short Segments of Text , 2007, ECIR.

[32]  Pradeep Ravikumar,et al.  A Comparison of String Distance Metrics for Name-Matching Tasks , 2003, IIWeb.

[33]  Ping Zhang,et al.  UNDERSTANDING CONSUMERS ATTITUDE TOWARD ADVERTISING , 2002 .

[34]  Yoram Singer,et al.  The Forgetron: A Kernel-Based Perceptron on a Budget , 2008, SIAM J. Comput..

[35]  Dan Gusfield,et al.  Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology , 1997 .

[36]  Eric Brill,et al.  Improving web search ranking by incorporating user behavior information , 2006, SIGIR.

[37]  P. Young,et al.  Time series analysis, forecasting and control , 1972, IEEE Transactions on Automatic Control.

[38]  Ron Kohavi,et al.  Practical guide to controlled experiments on the web: listen to your customers not to the hippo , 2007, KDD '07.

[39]  ChengXiang Zhai,et al.  A general optimization framework for smoothing language models on graph structures , 2008, SIGIR '08.

[40]  Eric Brill,et al.  Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users , 2004, EMNLP.

[41]  Ryen W. White,et al.  The Use of Implicit Evidence for Relevance Feedback in Web Retrieval , 2002, ECIR.