WebWave: globally load balanced fully distributed caching of hot published documents

Providing a document publication service over a network as large as the Internet challenges us to harness available server and network resources to meet fast-growing demand. We show that large-scale dynamic caching can be employed to globally minimize server idle time, and hence maximize the aggregate server throughput of the whole service. To be efficient, scalable, and robust, a successful caching mechanism must have three properties: it must (1) maximize the global throughput of the system; (2) find cache copies without recourse to a directory service or a discovery protocol; and (3) be completely distributed, in the sense of operating only on the basis of local information. We develop a precise definition, which we call tree load balance (TLB), of what it means for a mechanism to satisfy these three goals. We present an algorithm that computes a TLB load distribution offline, and a distributed protocol that induces a load distribution that converges quickly to a TLB one. Both algorithms place cache copies of immutable documents on the routing tree that connects the cached document's home server to its clients, thus enabling requests to stumble on cache copies en route to the home server.
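
The en-route lookup described above can be illustrated with a minimal Python sketch. It is not the WebWave protocol or the TLB algorithm: it only shows how a request climbing the routing tree is served by the first cache copy it meets, with no directory or discovery step. The tree, the node names, and the helpers `serve` and `loads` are hypothetical, introduced here for illustration.

```python
from collections import Counter

def serve(parent, cached, client):
    """Return the node that serves a request issued at `client`.

    The request climbs the routing tree toward the home server (the root,
    whose parent is None) and stops at the first node holding a cache copy.
    The home server always holds the document, so the walk always ends.
    """
    node = client
    while node not in cached and parent[node] is not None:
        node = parent[node]
    return node

def loads(parent, cached, requests):
    """Count how many of the given requests each node ends up serving."""
    return Counter(serve(parent, cached, client) for client in requests)

# Hypothetical routing tree: "home" is the home server, "c", "d", "b" are clients.
parent = {"home": None, "a": "home", "b": "home", "c": "a", "d": "a"}
requests = ["c", "d", "b"]

no_cache = loads(parent, set(), requests)     # every request reaches "home"
with_cache = loads(parent, {"a"}, requests)   # a copy at "a" intercepts "c" and "d"
```

Placing a copy at the interior node `a` moves the requests from `c` and `d` off the home server using only the existing routing path, which is the effect the abstract relies on. Choosing where to place copies so that load is balanced is the part the paper's algorithms address, and it is not modeled here.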
