Random walks on the click graph

Search engines can record which documents were clicked for which query, and use these query-document pairs as "soft" relevance judgments. However, compared to true relevance judgments, click logs give noisy and sparse relevance information. We apply a Markov random walk model to a large click log, producing a probabilistic ranking of documents for a given query. A key advantage of the model is its ability to retrieve relevant documents that have not yet been clicked for that query and to rank them effectively. We conduct experiments on click logs from image search, comparing our ("backward") random walk model to a different ("forward") random walk and varying parameters such as walk length and self-transition probability. The most effective combination is a long backward walk with high self-transition probability.
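The abstract does not spell out the model, so the following is only a minimal sketch of one way a forward and a backward walk on a click graph could be set up. Everything in it is an assumption made for illustration: the bipartite graph built from click counts, the transition rule (stay with probability `self_prob`, otherwise move to a neighbor in proportion to click count), the uniform prior over starting nodes in the backward walk, and all function names and toy data.

```python
from collections import defaultdict


def build_transitions(clicks, self_prob):
    """Build row-stochastic transitions on a bipartite query-document graph.

    clicks: dict mapping (query, doc) -> click count (hypothetical input format).
    Each node stays put with probability self_prob and otherwise moves to a
    neighbor in proportion to the click count on the connecting edge.
    """
    weights = defaultdict(dict)
    for (query, doc), count in clicks.items():
        q_node, d_node = ("q", query), ("d", doc)
        weights[q_node][d_node] = weights[q_node].get(d_node, 0) + count
        weights[d_node][q_node] = weights[d_node].get(q_node, 0) + count
    trans = {}
    for node, nbrs in weights.items():
        total = sum(nbrs.values())
        row = {n: (1.0 - self_prob) * w / total for n, w in nbrs.items()}
        row[node] = self_prob
        trans[node] = row
    return trans


def forward_walk(trans, start, steps):
    """Distribution over end nodes after `steps` steps starting from `start`."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for j, p in dist.items():
            for k, p_jk in trans[j].items():
                nxt[k] += p * p_jk
        dist = dict(nxt)
    return dist


def backward_scores(trans, end, steps):
    """P(walk started at k | it ended at `end`), assuming a uniform start prior."""
    # reach[k] = probability that a walk starting at k is at `end` after t steps
    reach = {n: 0.0 for n in trans}
    reach[end] = 1.0
    for _ in range(steps):
        reach = {k: sum(p * reach[m] for m, p in trans[k].items()) for k in trans}
    total = sum(reach.values())
    return {k: v / total for k, v in reach.items()}


def rank_documents(trans, query, steps):
    """Rank document nodes by backward-walk score for the given query."""
    scores = backward_scores(trans, ("q", query), steps)
    docs = [(name, s) for (kind, name), s in scores.items() if kind == "d"]
    return sorted(docs, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy click log (invented): "panda" has never been clicked for img3.
    toy_clicks = {
        ("panda", "img1"): 3,
        ("panda", "img2"): 1,
        ("giant panda", "img2"): 2,
        ("giant panda", "img3"): 4,
    }
    transitions = build_transitions(toy_clicks, self_prob=0.9)
    for doc, score in rank_documents(transitions, "panda", steps=3):
        print(doc, round(score, 4))
```

In this toy graph, img3 is reachable from the query "panda" only through img2 and the query "giant panda", so it receives a nonzero score only once the walk is at least three steps long. This illustrates how a longer walk can surface documents that were never clicked for the query.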
