Performance impact of large file transfer on web proxy caching: A case study in a high bandwidth campus network environment

Since large objects consume substantial resources, web proxy caching incurs a fundamental trade-off between performance (i.e., hit ratio and latency) and overhead (i.e., resource usage) when caching and relaying large objects to users. This paper investigates how, and to what extent, the current dedicated-server-based web proxy caching scheme is affected by large file transfers in a high-bandwidth campus network environment. We use a series of trace-based performance analyses and profile various resource components of our experimental Squid proxy cache server. Large file transfers often overwhelm our cache server, saturating its network bandwidth and creating a bottleneck in the web network. Because of requests for large objects, the response times for concurrently requested small objects increase by a factor as high as a few million in the worst cases. We argue that this cache bandwidth bottleneck is due to a fundamental limitation of the current centralized web proxy caching model, which scales poorly when the amount of dedicated resources is limited. This is a serious threat to the viability of the current web proxy caching model, particularly in a high-bandwidth access network, since it leads to sporadic disconnections of the downstream access network from the global web network. We propose a peer-to-peer cooperative web caching scheme to address the cache bandwidth bottleneck. We show that it caches and delivers large objects in an efficient and cost-effective manner, without generating significant overhead for participating peers.
