A Survey of Eigenvector Methods for Web Information Retrieval

Web information retrieval is significantly more challenging than traditional information retrieval over small, well-controlled document collections. One main difference between the two is the Web's hyperlink structure, which has been exploited by several of today's leading Web search engines, particularly Google and Teoma. In this survey paper, we focus on Web information retrieval methods that use eigenvector computations and present three popular methods: HITS, PageRank, and SALSA.
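To make the phrase "eigenvector computations" concrete, the following is a minimal sketch of PageRank computed by power iteration, in which the ranking vector is the dominant eigenvector of a stochastic matrix built from the hyperlink graph. It is an illustration under stated assumptions, not the survey's own formulation: the function name `pagerank`, the toy graph `toy_web`, the damping factor of 0.85, the uniform teleportation vector, and the uniform treatment of dangling pages are all choices made here for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank scores by power iteration.

    links maps each page to the list of pages it links to.
    damping, tol, and max_iter are illustrative defaults (assumptions).
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform vector

    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes its rank in equal shares along its out-links.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)

        # Stop once the 1-norm change between iterates is below tol.
        diff = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if diff < tol:
            break
    return rank


# Hypothetical four-page web used only for illustration.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph, page `c` collects links from every other page and ends up ranked highest, while page `d`, which has no in-links, receives only the teleportation share. HITS and SALSA replace the matrix being iterated but rely on the same kind of dominant-eigenvector computation.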
