Click-graph modeling for facet attribute estimation of web search queries

We use clickthrough data from a Japanese commercial search engine to evaluate the similarity between a query and a facet category based on patterns of clicks on URLs. Starting from a small number of seed queries, we extract a set of topical words that form search queries together with the same facet directive word, e.g., 'recipe' in 'curry recipe' or 'apple pie recipe'. We use a PageRank-like random walk on query-URL bipartite graphs, called "Biased ClickRank", to propagate facet attributes through the click graph. We observed that query-to-URL links are too sparse to capture query variations, whereas query-to-domain links are too coarse to discriminate among the different usages of broadly related queries. We therefore introduce vertices and edges corresponding to decomposed URL paths into the click graph, so that differences in click patterns are captured at an appropriate level of granularity. The expanded graph model improved recall as well as average precision over the baseline graph models.
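The idea of walking a click graph expanded with URL path vertices can be illustrated with a minimal sketch. It is not the Biased ClickRank formulation itself, which the abstract does not spell out; it assumes a generic random walk with restart at the seed queries (in the style of topic-sensitive PageRank), symmetric edges weighted by click counts, and a hypothetical decomposition of each URL into its ancestor paths down to the domain. The function names, the damping value, and the `.example` URLs are all illustrative assumptions.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def path_prefixes(url):
    """Return the URL (without scheme) followed by its ancestor paths, ending at the domain."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    nodes = [parts.netloc + "/" + "/".join(segments[:i])
             for i in range(len(segments), 0, -1)]
    nodes.append(parts.netloc)
    return nodes


def build_graph(clicks):
    """Build a symmetric weighted graph from (query, url, click_count) records.

    Assumption: each click links query -> URL -> parent path -> ... -> domain,
    and every edge on that chain accumulates the click count.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for query, url, count in clicks:
        chain = [("q", query)] + [("u", p) for p in path_prefixes(url)]
        for a, b in zip(chain, chain[1:]):
            graph[a][b] += count
            graph[b][a] += count
    return graph


def biased_walk(graph, seeds, damping=0.85, iterations=50):
    """Random walk with restart at the seed vertices (seeds must be in the graph)."""
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    score = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in graph}
        for n, neighbours in graph.items():
            total = sum(neighbours.values())
            for m, weight in neighbours.items():
                nxt[m] += damping * score[n] * weight / total
        score = nxt
    return score


# Hypothetical click log: (query, clicked URL, number of clicks).
clicks = [
    ("curry recipe", "http://cook.example/recipes/curry/1.html", 3),
    ("apple pie recipe", "http://cook.example/recipes/pie/9.html", 2),
    ("curry restaurant", "http://eat.example/shops/curry.html", 4),
]
graph = build_graph(clicks)
scores = biased_walk(graph, seeds={("q", "curry recipe")})
```

In this toy log the two recipe queries share no clicked URL, so a plain query-URL graph would leave them disconnected, while a query-domain graph would link every query clicking anywhere on the same site. The intermediate vertex `cook.example/recipes` lets score flow from the seed 'curry recipe' to 'apple pie recipe' without reaching 'curry restaurant'.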
