Optimizing Web Servers Using Page Rank Prefetching for Clustered Accesses

This paper presents a Page Rank based prefetching technique for accesses to Web page clusters. The approach uses the link structure of a requested page to determine the “most important” linked pages and to identify the page(s) to prefetch. The underlying premise is that, for cluster accesses, the next page a user requests from the Web server typically depends on the current and previously requested pages. Furthermore, if the requested pages contain many links to some “important” page, that page has a higher probability of being requested next. The prefetching mechanism is evaluated experimentally using real server logs. The results show that the Page Rank based scheme outperforms random prefetching for clustered accesses, with hit rates of 90% in some cases.
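The general idea can be illustrated with a minimal sketch. The abstract does not give the exact algorithm, so the following assumes a standard power-iteration Page Rank over the link graph of one page cluster, followed by prefetching the highest-ranked pages linked from the requested page. The function names (`page_rank`, `pages_to_prefetch`), the damping factor of 0.85, the iteration count, and the sample graph are all illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List


def page_rank(links: Dict[str, List[str]],
              damping: float = 0.85,
              iterations: int = 50) -> Dict[str, float]:
    """Approximate Page Rank by power iteration (assumed variant).

    `links` maps each page to the pages it links to within the cluster.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same base share from the damping term.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page without outgoing links: spread its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


def pages_to_prefetch(requested: str,
                      links: Dict[str, List[str]],
                      ranks: Dict[str, float],
                      k: int = 1) -> List[str]:
    """Return up to k pages linked from `requested`, highest rank first."""
    # Deduplicate links while keeping their order; skip self-links.
    candidates = [t for t in dict.fromkeys(links.get(requested, []))
                  if t != requested]
    return sorted(candidates, key=lambda p: ranks[p], reverse=True)[:k]


# Hypothetical cluster: three pages link to "b", one links to "a".
cluster_links = {
    "index": ["a", "b"],
    "a": ["b"],
    "b": ["index"],
    "c": ["b"],
}
cluster_ranks = page_rank(cluster_links)
print(pages_to_prefetch("index", cluster_links, cluster_ranks, k=1))
```

In this sketch the ranks are computed once per cluster and reused for every request, so the per-request work is a lookup and a sort over the requested page's outgoing links. Whether the paper recomputes ranks per request or weights them by access history is not stated in the abstract.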
