Characterizing Network Traffic in a Cluster-based, Multi-tier Data Center

With the increasing use of Web-based services, the design of high-performance, scalable, and dependable data centers has become a critical issue. Recent studies show that a clustered, multi-tier architecture is a cost-effective approach to designing such servers. Because these servers are highly distributed and complex, understanding the workloads that drive them is crucial to ongoing research on improving them. Accordingly, a significant amount of work has characterized the workloads of Web-based services. However, all previous studies take a high-level view of these servers and analyze request-based or session-based characteristics of the workloads. In this paper, we focus on the network behavior within a clustered, multi-tier data center. Using a real implementation of a clustered three-tier data center, we analyze the arrival rate and inter-arrival time distribution of requests to individual server nodes, the network traffic between tiers, and the average size of messages exchanged between tiers. The main results of this study are: (1) in most cases, request inter-arrival times follow a log-normal distribution, and self-similarity exists when the data center is heavily loaded; (2) message sizes can be modeled by a log-normal distribution; and (3) service times fit the Pareto distribution reasonably well and show heavy-tailed behavior under heavy load.
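As an illustration of the kind of distribution fitting summarized above, the following is a minimal sketch, not the procedure used in this study. It assumes maximum-likelihood fitting and uses synthetic samples in place of measured traces; the function names `fit_lognormal` and `fit_pareto`, the sample sizes, and all parameter values are hypothetical.

```python
import math
import random


def fit_lognormal(samples):
    """Maximum-likelihood log-normal fit.

    Returns (mu, sigma): the mean and standard deviation of log(samples).
    """
    logs = [math.log(x) for x in samples]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def fit_pareto(samples, x_min=None):
    """Maximum-likelihood Pareto fit.

    Returns (x_min, alpha). If x_min is not given, the smallest sample
    is used as the scale parameter (an assumption of this sketch).
    """
    if x_min is None:
        x_min = min(samples)
    tail = [x for x in samples if x >= x_min]
    alpha = len(tail) / sum(math.log(x / x_min) for x in tail)
    return x_min, alpha


# Synthetic stand-ins for measured data (hypothetical parameters).
rng = random.Random(0)
inter_arrivals = [rng.lognormvariate(-2.0, 0.8) for _ in range(20000)]
service_times = [0.01 * rng.paretovariate(1.5) for _ in range(20000)]

mu, sigma = fit_lognormal(inter_arrivals)
x_min, alpha = fit_pareto(service_times)

print(f"log-normal fit: mu={mu:.3f}, sigma={sigma:.3f}")
print(f"Pareto fit: x_min={x_min:.5f}, alpha={alpha:.3f}")
```

With real traces, the fitted parameters would normally be checked against the empirical distribution (for example with a goodness-of-fit test) before a model is accepted. Testing for self-similarity requires a separate analysis that this sketch does not cover.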
