Class-based cache management for dynamic Web content

Caching dynamic pages at a server site reduces server resource demands and also helps dynamic page caching at proxy sites. Previous work has used fine-grain dependence graphs among individual dynamic pages and underlying data sets to enforce result consistency. This paper proposes a complementary solution for applications that require coarse-grain cache management. The key idea is to partition dynamic pages into classes based on URL patterns, so that an application can specify page identification and data dependence, and invoke invalidation, for a whole class of dynamic pages. To make this scheme time-efficient with a small space requirement, lazy invalidation is used to minimize slow disk accesses when the IDs of dynamic pages are stored in memory in digest format. Selective precomputing is further proposed to refresh stale pages and smooth load peaks. A data structure is developed for efficient URL class searching during lazy or eager invalidation. This paper also presents the design and implementation of a caching system called Cachuma, which integrates the above techniques, runs in tandem with standard Web servers, and allows Web sites to add dynamic page caching capability with minimal changes. The experimental results show that the proposed techniques are effective in supporting coarse-grain cache management and in reducing server response times for the tested applications.
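The idea of class-based lazy invalidation can be illustrated with a minimal sketch. It is not Cachuma's implementation: the class `ClassBasedCache`, its method names, the regular-expression form of the URL patterns, the use of SHA-256 as the digest, and the logical clock are all assumptions made for illustration. Invalidating a class only records a logical timestamp; a cached page is checked against the timestamps of its classes when it is next requested.

```python
import hashlib
import itertools
import re


class ClassBasedCache:
    """Toy cache (hypothetical): pages are grouped into URL classes
    and invalidated lazily, at lookup time."""

    def __init__(self, class_patterns):
        # Class name -> compiled URL pattern (assumed configuration format).
        self.patterns = {n: re.compile(p) for n, p in class_patterns.items()}
        # Class name -> logical time of the most recent invalidation.
        self.invalidated_at = {n: 0 for n in class_patterns}
        # URL digest -> (logical time the page was stored, page body).
        self.entries = {}
        # Monotonic logical clock shared by stores and invalidations.
        self.clock = itertools.count(1)

    @staticmethod
    def digest(url):
        # Fixed-size page ID; the choice of SHA-256 is an assumption.
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def classes_of(self, url):
        # A linear scan stands in for an efficient class-search structure.
        return [n for n, p in self.patterns.items() if p.match(url)]

    def put(self, url, body):
        self.entries[self.digest(url)] = (next(self.clock), body)

    def invalidate_class(self, name):
        # Lazy invalidation: record the time, touch no cached page.
        self.invalidated_at[name] = next(self.clock)

    def get(self, url):
        key = self.digest(url)
        entry = self.entries.get(key)
        if entry is None:
            return None
        stored_at, body = entry
        # The page is stale if any of its classes was invalidated later.
        if any(self.invalidated_at[c] > stored_at for c in self.classes_of(url)):
            del self.entries[key]
            return None
        return body


cache = ClassBasedCache({"news": r"^/news/", "quote": r"^/quote\?"})
cache.put("/news/today", "<html>headlines</html>")
cache.put("/quote?sym=ABC", "<html>price</html>")
cache.invalidate_class("news")   # constant-time, regardless of class size
stale = cache.get("/news/today")     # detected as stale on access
fresh = cache.get("/quote?sym=ABC")  # unaffected by the "news" invalidation
```

In this sketch, invalidation costs the same no matter how many pages belong to the class, and the staleness check is deferred to the next request for each page. An eager variant would instead walk the stored entries at invalidation time.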
