To determine the order in which to display web pages, the search engine Google computes the PageRank vector, whose entries are the PageRanks of the web pages. The PageRank vector is the stationary distribution of a stochastic matrix, the Google matrix. The Google matrix in turn is a convex combination of two stochastic matrices: one matrix represents the link structure of the web graph, and a second, rank-one matrix mimics the random behaviour of web surfers and can also be used to combat web spamming. As a consequence, PageRank depends mainly on the link structure of the web graph, but not on the contents of the web pages. We analyze the sensitivity of PageRank to changes in the Google matrix, including addition and deletion of links in the web graph. Due to the proliferation of web pages, the dimension of the Google matrix most likely exceeds ten billion. One of the simplest and most storage-efficient methods for computing PageRank is the power method. We present error bounds for the iterates of the power method and for their residuals.
Key words: Markov matrix, stochastic matrix, stationary distribution, power method, perturbation bounds
AMS subject classifications: 15A51, 65C40, 65F15, 65F50, 65F10
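The construction described in the abstract can be illustrated with a minimal sketch of the power method applied to a Google matrix of the form G = alpha*S + (1 - alpha)*1*v^T. The names `S`, `alpha`, `v`, and `pagerank_power`, the value alpha = 0.85, the uniform choice of `v`, and the three-page toy graph are all assumptions made for illustration and do not come from the abstract. The sketch never forms G explicitly, which is the kind of storage saving that makes the power method attractive at web scale.

```python
import numpy as np

def pagerank_power(S, alpha=0.85, v=None, tol=1e-10, max_iter=1000):
    """Power method sketch for pi^T = pi^T G with G = alpha*S + (1-alpha)*1*v^T.

    S     : row-stochastic link matrix (assumed to have no zero rows)
    alpha : weight of the link matrix in the convex combination (assumed value)
    v     : probability vector defining the rank-one part (assumed uniform)
    """
    n = S.shape[0]
    if v is None:
        v = np.full(n, 1.0 / n)
    pi = v.copy()
    for k in range(1, max_iter + 1):
        # pi sums to 1, so pi^T (1 v^T) = v^T and G never has to be stored.
        new = alpha * (pi @ S) + (1.0 - alpha) * v
        # One-norm of the residual pi^T G - pi^T of the previous iterate.
        residual = np.abs(new - pi).sum()
        pi = new
        if residual < tol:
            break
    return pi, k, residual

# Hypothetical toy web: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
S = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
pi, iterations, residual = pagerank_power(S)
```

In this toy graph the returned `pi` sums to one, and page 2, which receives links from both other pages, ends up with the largest entry. The stopping test uses the one-norm of the residual as a simple convergence indicator; other norms or tolerances would work equally well in a sketch like this.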