Effective latent space graph-based re-ranking model with global consistency

Re-ranking algorithms have recently become popular in web search and data mining. One limitation, however, is that these algorithms treat content and link information separately. Inspired by graph-based machine learning algorithms, we propose a novel and general re-ranking framework that regularizes the smoothness of ranking scores over a graph, together with a regularizer that keeps the scores close to the initial ranking scores obtained from the base ranker. The intuition behind the model is global consistency over the graph: similar entities are likely to have similar ranking scores with respect to a query. Our approach simultaneously incorporates content and other explicit or implicit link information in a latent space graph. A unified re-ranking algorithm is then run on this graph with respect to the query. To illustrate the methodology, we apply the framework to literature retrieval and expert finding on DBLP bibliography data. We compare the proposed method with the initial language model baseline and with a PageRank-style re-ranking method, and we evaluate it with varying graphs and settings. Experimental results show that the proposed method improves on these baselines consistently.
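
The regularization idea can be illustrated with a small sketch. The abstract does not give the exact objective, so the code below assumes a generic quadratic form, f^T L f + mu * ||f - f0||^2, where L is the symmetric normalized graph Laplacian, f0 holds the base ranker's scores, and mu trades smoothness against fidelity to f0. Setting the gradient to zero gives the linear system (L + mu I) f = mu f0. The names `rerank`, `W`, `f0`, and `mu` are hypothetical and chosen only for illustration; this is not necessarily the formulation used in the paper.

```python
import numpy as np

def rerank(W, f0, mu=0.5):
    """Smooth initial scores f0 over a graph with symmetric affinity matrix W.

    Assumed objective (illustrative, not taken from the paper):
        minimize  f^T L f + mu * ||f - f0||^2
    with L the symmetric normalized Laplacian; the minimizer solves
        (L + mu * I) f = mu * f0.
    """
    W = np.asarray(W, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    n = len(f0)

    # Node degrees; isolated nodes (degree 0) are left unnormalized.
    d = W.sum(axis=1)
    connected = d > 0
    inv_sqrt = np.zeros_like(d)
    inv_sqrt[connected] = 1.0 / np.sqrt(d[connected])

    # S = D^{-1/2} W D^{-1/2}; L = I - S on connected nodes, 0 on isolated ones,
    # so an isolated node simply keeps its initial score.
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    L = np.diag(connected.astype(float)) - S

    # L is positive semidefinite, so L + mu*I is invertible for mu > 0.
    return np.linalg.solve(L + mu * np.eye(n), mu * f0)

# Toy graph: entities 0 and 1 are linked, entity 2 is isolated.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
f0 = np.array([1.0, 0.0, 0.5])
scores = rerank(W, f0, mu=0.5)
print(scores)
```

In this toy graph the two linked entities are pulled toward each other while the isolated entity keeps its base score. A larger `mu` keeps the result closer to the base ranker, and a smaller `mu` enforces more smoothness over the graph.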
