Learning query and document similarities from click-through bipartite graph with metadata

We consider learning query and document similarities from a click-through bipartite graph with metadata on the nodes. The metadata contains multiple types of features of queries and documents. We aim to leverage both the click-through bipartite graph and the features to learn query-document, document-document, and query-query similarities. The challenge is how to model and learn the similarity functions from the graph data. We propose a principled solution. Specifically, we use two different linear mappings to project queries and documents, which lie in two different feature spaces, into the same latent space, and take the dot product in the latent space as their similarity. Query-query and document-document similarities can also be naturally defined as dot products in the latent space. We formalize the learning of the similarity functions as learning the mappings that maximize the similarities of the observed query-document pairs on the enriched click-through bipartite graph. When queries and documents have multiple types of features, the similarity function is defined as a linear combination of multiple similarity functions, each based on one type of feature. We solve the learning problem with a new technique called Multi-view Partial Least Squares (M-PLS). The method has two advantages: its global optimum can be obtained through Singular Value Decomposition (SVD), and it is capable of finding high-quality similar queries. We conducted large-scale experiments on enterprise search data and web search data. The experimental results on relevance ranking and similar query finding demonstrate that the proposed method works significantly better than the baseline methods.
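
The single-feature-type case of the idea above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the M-PLS implementation: the abstract does not give the constraints on the mappings, so the sketch assumes they have orthonormal columns, assumes click counts serve as edge weights, and covers only one type of feature. The names `fit_latent_mappings`, `similarity`, `Q`, `D`, `C`, and `k` are hypothetical.

```python
import numpy as np

def fit_latent_mappings(Q, D, C, k):
    """Learn two linear mappings into a shared k-dimensional latent space.

    Q: (n_queries, d_q) query feature matrix (assumed layout)
    D: (n_docs, d_d) document feature matrix (assumed layout)
    C: (n_queries, n_docs) click counts on the bipartite graph (assumed weights)
    Assumption: both mappings are constrained to have orthonormal columns.
    """
    # Click-weighted cross-product of query and document features.
    M = Q.T @ C @ D                                   # shape (d_q, d_d)
    # Under the orthonormality assumption, the top-k singular vectors of M
    # maximize the click-weighted sum of latent dot products.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    L_q = U[:, :k]                                    # query mapping, (d_q, k)
    L_d = Vt[:k, :].T                                 # document mapping, (d_d, k)
    return L_q, L_d

def similarity(x, y, L_x, L_y):
    """Dot product of two feature vectors after projection into the latent space."""
    return float((x @ L_x) @ (y @ L_y))

# Toy data: 5 queries with 4 features, 6 documents with 3 features.
rng = np.random.default_rng(0)
Q = rng.random((5, 4))
D = rng.random((6, 3))
C = rng.integers(0, 5, size=(5, 6)).astype(float)

L_q, L_d = fit_latent_mappings(Q, D, C, k=2)
query_doc_sim = similarity(Q[0], D[0], L_q, L_d)      # query-document
query_query_sim = similarity(Q[0], Q[1], L_q, L_q)    # query-query
doc_doc_sim = similarity(D[0], D[1], L_d, L_d)        # document-document
```

With several feature types, the abstract describes the final similarity as a linear combination of such per-type similarities; how the combination weights are learned is not stated here, so the sketch leaves that part out.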
