Design of efficient caching schemes for the World Wide Web

The Web has been so successful since its inception just a few years ago that Web applications now generate more data traffic on the Internet than any other application. The resulting risk of congested Internet links can be addressed in different ways, such as carefully dimensioning network resources, using multicast transmission for document delivery, and using client and proxy caching. In this paper we focus on caching, and specifically on the problem of designing efficient caching algorithms. Most algorithms used in practice were not designed specifically for the Web; they were taken directly from earlier work on memory caches. We first argue that these algorithms are not well adapted to the Web environment. We then propose a method for designing a cache algorithm that minimizes the document retrieval time perceived by a user, the impact of user requests on network resources, and the memory cost of storing remote documents in the local cache. We find that such an algorithm must take both network cost (i.e., the document transfer time) and memory cost (i.e., the document size) into account, unlike most cache algorithms in use today.
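To make the role of the two costs concrete, the following is a minimal sketch of a replacement policy that weighs transfer time against document size. It is not the algorithm derived in the paper: the class name CostAwareCache, the Entry fields, and in particular the weight formula (transfer time divided by size and by age) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entry:
    size: int             # document size in bytes (memory cost)
    transfer_time: float  # seconds needed to fetch the document (network cost)
    last_access: int      # logical clock value of the most recent request


class CostAwareCache:
    """Toy cache that evicts the document with the lowest retention weight."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # total bytes available
        self.used = 0             # bytes currently occupied
        self.clock = 0            # logical time, advanced on every request
        self.entries: Dict[str, Entry] = {}

    def weight(self, e: Entry) -> float:
        # Hypothetical weight (an assumption, not the paper's formula):
        # refetch cost per byte stored, discounted by time since last access.
        age = self.clock - e.last_access + 1
        return e.transfer_time / (e.size * age)

    def get(self, url: str) -> bool:
        """Return True on a hit and refresh the entry's access time."""
        self.clock += 1
        e = self.entries.get(url)
        if e is None:
            return False
        e.last_access = self.clock
        return True

    def put(self, url: str, size: int, transfer_time: float) -> bool:
        """Store a document, evicting lowest-weight entries until it fits."""
        self.clock += 1
        if size > self.capacity:
            return False  # document can never fit in this cache
        old = self.entries.pop(url, None)
        if old is not None:
            self.used -= old.size
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda u: self.weight(self.entries[u]))
            self.used -= self.entries.pop(victim).size
        self.entries[url] = Entry(size, transfer_time, self.clock)
        self.used += size
        return True
```

Under this weight, a document that is expensive to refetch can outlive a more recently used document that is cheap to refetch, which a pure least-recently-used policy taken from memory caching would not allow.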
