Sources and characteristics of Web temporal locality

Temporal locality of reference in Web request streams emerges from two distinct phenomena: the long-term popularity of Web documents and the short-term temporal correlations of references. We show that the commonly used distribution of inter-request times is predominantly determined by the power law governing the long-term popularity of documents. This inherent relationship tends to disguise the existence of short-term temporal correlations. We propose a new and robust metric that enables accurate characterization of these short-term correlations. Using this metric, we characterize the locality of reference in a number of representative proxy cache traces. Our findings show measurable differences in both the degree and the sources of temporal locality across these traces.
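The claim that popularity alone shapes the inter-request time distribution can be illustrated with a small simulation. The sketch below is not the authors' method or metric. It assumes a synthetic stream of independent draws from a Zipf-like popularity law (so the stream has no short-term correlation by construction) and measures inter-request time as the number of requests between successive references to the same document. The function names, the exponent `alpha`, and the stream sizes are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def zipf_stream(num_docs, num_requests, alpha=1.0, seed=0):
    """Draw independent requests with popularity proportional to 1/rank**alpha.

    Independence means the stream has no short-term temporal correlation;
    any locality it shows comes from long-term popularity alone.
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_docs + 1)]
    return rng.choices(range(num_docs), weights=weights, k=num_requests)


def inter_request_distances(stream):
    """Return, for each re-reference, the number of requests since the
    previous reference to the same document."""
    last_seen = {}
    distances = []
    for position, doc in enumerate(stream):
        if doc in last_seen:
            distances.append(position - last_seen[doc])
        last_seen[doc] = position
    return distances


def short_distance_fraction(distances, threshold):
    """Fraction of re-references that occur within `threshold` requests."""
    if not distances:
        return 0.0
    return sum(1 for d in distances if d <= threshold) / len(distances)


# Illustrative parameters (assumptions, not taken from any trace).
stream = zipf_stream(num_docs=1000, num_requests=50000)
distances = inter_request_distances(stream)
histogram = Counter(distances)

for threshold in (10, 100, 1000):
    share = short_distance_fraction(distances, threshold)
    print(f"re-references within {threshold} requests: {share:.3f}")
```

Because the draws are independent, any concentration of short inter-request distances in this output is attributable to the popularity skew, which is why a metric based on that distribution alone cannot isolate short-term correlations.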
