Performance and Reliability Effects of Multi-tier Bidding on MapReduce in Auction-Based Clouds

Hadoop has become a central big data processing framework in today's cloud environments. Ensuring good performance and cost-effectiveness of Hadoop is crucial for the many applications that rely on it. In this paper, we analyze Hadoop's performance on a multi-tier, market-oriented cloud infrastructure known as Spot Instances. Amazon Spot Instances (SIs) are designed to deliver a cheap but transient alternative to fixed-cost On-Demand Instances (ODIs). Recently, AWS introduced SIs in its managed Elastic MapReduce offering. This managed framework lets users design a multi-tier Hadoop architecture with fine-grained controls that define instance types in terms of both capacity (compute, storage, and network) and cost (ODI vs. SI). The performance effects of such fine-grained configurations are not yet well understood. First, we analyze a set of cluster configurations whose performance effects can change both the running time and the cost of such cloud Hadoop clusters. Second, we examine Hadoop's fault tolerance mechanisms and show that they are inadequate for multi-tier bidding architectures. Third, we discuss directions for making the Hadoop framework more market-aware without losing its focus on extreme scalability.
