Simrank++: query rewriting through link analysis of the clickgraph (poster)

We focus on the problem of query rewriting for sponsored search. We base rewrites on a historical click graph that records the ads that have been clicked on in response to past user queries. Given a query q, we first consider Simrank [2] as a way to identify queries similar to q, i.e., queries whose ads a user may be interested in. We argue that Simrank fails to properly identify query similarities in our application, and we present two enhanced versions of Simrank: one that exploits weights on click graph edges and another that exploits "evidence." We experimentally evaluate our new schemes against Simrank, using actual click graphs and queries from Yahoo!, and using a variety of metrics. Our results show that the enhanced methods can yield more and better query rewrites.
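To make the baseline concrete, the following is a minimal sketch of plain bipartite Simrank on a toy click graph, not of the two enhanced variants described above. The function name `bipartite_simrank`, the decay constants `c1` and `c2`, the iteration count, and the example queries and ads are all assumptions chosen for illustration; none of them come from the paper's experiments.

```python
from itertools import product


def bipartite_simrank(clicks, c1=0.8, c2=0.8, iterations=5):
    """Plain bipartite Simrank on a click graph.

    clicks maps each query to the set of ads clicked for it.
    c1, c2 and iterations are illustrative defaults (assumptions).
    Returns (query similarities, ad similarities) as dicts keyed by pairs.
    """
    queries = sorted(clicks)
    ads = sorted({ad for clicked in clicks.values() for ad in clicked})
    ads_of = {q: set(clicks[q]) for q in queries}
    queries_of = {a: {q for q in queries if a in ads_of[q]} for a in ads}

    # Iteration 0: every node is similar only to itself.
    sim_q = {(x, y): float(x == y) for x, y in product(queries, repeat=2)}
    sim_a = {(x, y): float(x == y) for x, y in product(ads, repeat=2)}

    def update(nodes, neighbours, other_sim, decay):
        # Two nodes are similar if their neighbours are similar:
        # average the neighbour-pair similarities and apply the decay.
        new_sim = {}
        for x, y in product(nodes, repeat=2):
            if x == y:
                new_sim[x, y] = 1.0
            elif not neighbours[x] or not neighbours[y]:
                new_sim[x, y] = 0.0
            else:
                total = sum(other_sim[i, j]
                            for i in neighbours[x] for j in neighbours[y])
                new_sim[x, y] = decay * total / (
                    len(neighbours[x]) * len(neighbours[y]))
        return new_sim

    for _ in range(iterations):
        # Both updates read the previous iteration's scores.
        new_q = update(queries, ads_of, sim_a, c1)
        new_a = update(ads, queries_of, sim_q, c2)
        sim_q, sim_a = new_q, new_a
    return sim_q, sim_a


# Hypothetical toy click graph: query -> ads clicked for that query.
toy_clicks = {
    "camera": {"hp.com", "bestbuy.com"},
    "digital camera": {"hp.com", "bestbuy.com"},
    "pc": {"hp.com"},
    "flower": {"teleflora.com"},
}

query_sim, _ = bipartite_simrank(toy_clicks)

# Candidate rewrites for "camera", most similar first.
rewrites = sorted(
    (q for q in toy_clicks if q != "camera"),
    key=lambda q: query_sim["camera", q],
    reverse=True,
)
print(rewrites)
```

In this sketch every click counts equally, so the number of clicks on an edge has no influence on the scores. Edge weights of that kind are one of the signals the enhanced schemes are said to exploit.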

[1] Fan Chung Graham, et al. Local Graph Partitioning using PageRank Vectors, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[2] Ji-Rong Wen, et al. Query clustering using user logs, 2002, TOIS.

[3] Ian Ruthven, et al. Re-examining the potential effectiveness of interactive query expansion, 2003, SIGIR.

[4] Daniel C. Fain, et al. Predicting Click-Through Rate Using Keyword Clusters, 2006.

[5] Ioannis Antonellis, et al. Simrank++: query rewriting through link analysis of the clickgraph (poster), 2007, Proc. VLDB Endow.

[6] T. Landauer, et al. Indexing by Latent Semantic Analysis, 1990.

[7] Jennifer Widom, et al. SimRank: a measure of structural-context similarity, 2002, KDD.

[8] Santosh S. Vempala, et al. Latent semantic indexing: a probabilistic analysis, 1998, PODS '98.

[9] Charles L. A. Clarke, et al. Scoring missing terms in information retrieval tasks, 2004, CIKM '04.

[10] Xiaofei He, et al. Query rewriting using active learning for sponsored search, 2007, SIGIR.

[11] Nick Craswell, et al. Random walks on the click graph, 2007, SIGIR.

[12] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[13] Matthew Richardson, et al. Predicting clicks: estimating the click-through rate for new ads, 2007, WWW '07.

[14] Doug Beeferman, et al. Agglomerative clustering of a search engine query log, 2000, KDD '00.

[15] Rosie Jones, et al. Query word deletion prediction, 2003, SIGIR.

[16] Alistair Sinclair, et al. Algorithms for Random Generation and Counting: A Markov Chain Approach, 1993, Progress in Theoretical Computer Science.

[17] Wei Vivian Zhang, et al. Comparing Click Logs and Editorial Labels for Training Query Rewriting, 2007.

[18] Benjamin Rey, et al. Generating query substitutions, 2006, WWW '06.