论文信息 - Dirichlet PageRank

Dirichlet PageRank

PageRank has been known to be a successful algorithm in ranking web sources. In order to avoid the rank sink problem, PageRank assumes that a surfer, being in a page, jumps to a random page with a certain probability. In the standard PageRank algorithm, the jumping probabilities are assumed to be the same for all the pages, regardless of the page properties. This is not the case in the real world, since presumably a surfer would more likely follow the out-links of a high-quality hub page than follow the links of a low-quality one. In this poster, we propose a novel algorithm "Dirichlet PageRank" to address this problem by adapting exible jumping probabilities based on the number of out-links in a page. Empirical results on TREC data show that our method outperforms the standard PageRank algorithm.

Azadeh Shakery | Tao Tao | Xuanhui Wang

[1] Wei-Ying Ma,et al. Block-level link analysis , 2004, SIGIR '04.

[2] Sergey Brin,et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[3] John D. Lafferty,et al. A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.