Internet Web servers: workload characterization and performance implications

This paper presents a workload characterization study for Internet Web servers. Six different data sets are used in the study: three from academic environments, two from scientific research organizations, and one from a commercial Internet provider. These data sets represent three different orders of magnitude in server activity, and two different orders of magnitude in time duration, ranging from one week of activity to one year. The workload characterization focuses on the document type distribution, the document size distribution, the document referencing behavior, and the geographic distribution of server requests. Throughout the study, emphasis is placed on finding workload characteristics that are common to all the data sets studied. Ten such characteristics are identified. The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.
