Runtime measurements in the cloud

One of the main reasons cloud computing has gained so much popularity is its ease of use and its ability to scale computing resources on demand. As a result, users can now rent computing nodes on large commercial clusters from several vendors, such as Amazon and Rackspace. However, despite the attention paid by cloud providers, performance unpredictability remains a major issue for (1) database researchers performing wall-clock experiments and (2) database applications providing service-level agreements. In this paper, we study the performance variance of the most widely used cloud infrastructure, Amazon EC2, from different perspectives. We use established microbenchmarks to measure performance variance in CPU, I/O, and network, and we use a multi-node MapReduce application to quantify the impact on real data-intensive applications. We collected data for an entire month and compare it with results obtained on a local cluster. Our results show that EC2 performance varies considerably and often falls into two bands with a large performance gap in between, which is somewhat surprising. Our experiments indicate that these two bands correspond to the different virtual system types provided by Amazon. Moreover, we analyze the results across different availability zones, points in time, and locations. This analysis indicates that, among other factors, the choice of availability zone also influences performance variability. A major conclusion of our work is that the variance on EC2 is currently so high that wall-clock experiments may only be performed with considerable care. To this end, we provide some hints to users.
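To make the kind of measurement described above concrete, the following is a minimal sketch (not the paper's actual benchmark suite) of how one might quantify run-to-run variance of a CPU-bound workload on a rented instance. The function names cpu_microbenchmark and measure_variance, the toy workload, and the choice of 30 runs are illustrative assumptions; the study itself relies on established microbenchmarks for CPU, I/O, and network plus a multi-node MapReduce job.

```python
import statistics
import time


def cpu_microbenchmark(iterations: int = 2_000_000) -> float:
    """Toy CPU-bound workload (illustrative only); returns elapsed wall-clock seconds."""
    start = time.perf_counter()
    acc = 0
    for i in range(iterations):
        acc += (i * i) % 7
    return time.perf_counter() - start


def measure_variance(runs: int = 30) -> None:
    """Repeat the workload and report mean, standard deviation, and coefficient of variation."""
    samples = [cpu_microbenchmark() for _ in range(runs)]
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    print(f"runs={runs} mean={mean:.4f}s stdev={stdev:.4f}s COV={stdev / mean:.2%}")


if __name__ == "__main__":
    measure_variance()
```

Running such a script repeatedly on instances in different availability zones, at different points in time, and on a local cluster would yield the kind of per-resource variance data the study compares; a large coefficient of variation, or samples clustering into two separated bands, signals exactly the unpredictability discussed above.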
