Traffic Properties, Client Side Cachability and CDN Usage of Popular Web Sites

Web traffic measurement and modeling have contributed to understanding the effect of Web traffic on Internet resources since the 1990s. In recent years, a number of newer Web features have grown in importance, e.g. content delivery networks (CDNs), a larger volume of advertising, personalization, usage tracking, client-side scripting and Web 2.0 style “mashups”. This paper uses active Web measurements to assess the efficiency of client-side caching for modern Web sites and investigates some of these features in detail. As expected, client-side caching saves more than 50 % of the average downstream traffic volume when a page is loaded. Less expected results include the actual distribution of cache effectiveness, which ranges from extreme reduction of traffic to none at all, the cachability of “Web bugs”, and the variation between sites in cachable image pixels and CDN-based files.
