Inside PageRank

Although the interest of a Web page depends strictly on its content and on the reader's subjective cultural background, a measure of page authority can be defined that depends only on the topological structure of the Web. PageRank is a notable way to attach a score to Web pages on the basis of Web connectivity. In this article, we look inside PageRank to disclose its fundamental properties concerning stability, the complexity of the computational scheme, and the critical role of the parameters involved in the computation. Moreover, we introduce a circuit analysis that allows us to understand the distribution of the page score, the way different Web communities interact with each other, the role of dangling pages (pages with no outlinks), and the secrets for promotion of Web pages.
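As a rough illustration of how a score can be derived from connectivity alone, the following is a minimal sketch of a commonly taught PageRank iteration. It is not necessarily the exact formulation analyzed in the article. The function name `pagerank`, the damping value `d=0.85`, the uniform redistribution of score from dangling pages, and the four-page toy graph are all assumptions made for this example.

```python
def pagerank(outlinks, d=0.85, tol=1e-10, max_iter=200):
    """Iterate a textbook PageRank variant (an assumption, not the article's scheme).

    outlinks maps each page to the list of pages it links to; every link
    target must itself be a key of the dictionary.
    """
    pages = sorted(outlinks)
    n = len(pages)
    score = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(max_iter):
        # Assumed policy: score held by dangling pages (no outlinks)
        # is spread uniformly over all pages.
        dangling = sum(score[p] for p in pages if not outlinks[p])
        new = {p: (1.0 - d) / n + d * dangling / n for p in pages}
        for p in pages:
            targets = outlinks[p]
            for q in targets:
                # Each page splits its damped score evenly among its outlinks.
                new[q] += d * score[p] / len(targets)
        change = sum(abs(new[p] - score[p]) for p in pages)
        score = new
        if change < tol:
            break
    return score


# Hypothetical toy Web: "d" is a dangling page with no inlinks or outlinks.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
scores = pagerank(web)
```

In this toy graph the scores sum to one, page `c` (linked from both `a` and `b`) outranks `b`, and the isolated dangling page `d` receives the lowest score. Changing `d` or the dangling-page policy changes the resulting distribution, which is the kind of parameter sensitivity the abstract refers to.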
