Characterization and Comparison of Google Cloud Load versus Grids

A new era of Cloud Computing has emerged, but the characteristics of Cloud load in data centers are not yet well understood. This characterization is nevertheless critical for the design of novel Cloud job and resource management systems. In this paper, we comprehensively characterize the job/task load and host load in a real-world production data center at Google Inc. We use a detailed trace of over 25 million tasks across over 12,500 hosts. We study the differences between a Google data center and other Grid/HPC systems from the perspective of both workload (w.r.t. jobs and tasks) and host load (w.r.t. machines). In particular, we study the job length, job submission frequency, and resource utilization of jobs in the different systems, and we also investigate statistics of machines' maximum load, queue state, and relative usage levels under different job priorities and resource attributes. We find that the Google data center exhibits finer-grained resource allocation with respect to CPU and memory than Grid/HPC systems do. Google jobs are always submitted at a much higher frequency and are much shorter than Grid jobs. As a result, Google host load exhibits higher variance and noise.
