Understanding Inefficiencies in Data-Intensive Computing

Abstract : New programming frameworks for scale-out parallel analysis, such as MapReduce and Hadoop, have become a cornerstone for exploiting large datasets. However, there has been little analysis of how such systems perform relative to the capabilities of the hardware on which they run. This paper describes a simple model of I/O resource consumption that predicts the ideal lowerbound runtime of a parallel dataflow on a particular set of hardware. Comparing actual system performance to the model's ideal prediction exposes the inefficiency of a scale-out system. Using a simplified dataflow processing tool called Parallel DataSeries we show that the model's ideal can be approached (i.e., that it is not wildly optimistic), but that a gap of up to 20% remains for workloads using up to 45 nodes. Guided by the model, we analyze inefficiencies exposed in both the disk and networking subsystems--issues that will be faced by any DISC system built atop popular commodity hardware and OSs.

[1]  David J. DeWitt,et al.  Parallel database systems: the future of high performance database systems , 1992, CACM.

[2]  John Wilkes,et al.  An introduction to disk drive modeling , 1994, Computer.

[3]  R. V. Meter Observing the effects of multi-zone disks , 1997 .

[4]  Steven D. Gribble,et al.  Robustness in complex systems , 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.

[5]  Andrea C. Arpaci-Dusseau,et al.  Fail-stutter fault tolerance , 2001, Proceedings Eighth Workshop on Hot Topics in Operating Systems.

[6]  Gregory R. Ganger,et al.  Track-Aligned Extents: Matching Access Patterns to Disk Drive Characteristics , 2002, FAST.

[7]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[8]  Jeffrey C. Mogul,et al.  Emergent (mis)behavior vs. complex software systems , 2006, EuroSys.

[9]  Srinivasan Seshan,et al.  On application-level approaches to avoiding TCP throughput collapse in cluster-based storage systems , 2007, PDSW '07.

[10]  Christoforos E. Kozyrakis,et al.  Evaluating MapReduce for Multi-core and Multiprocessor Systems , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.

[11]  Gregory R. Ganger,et al.  Argon: Performance Insulation for Shared Storage Servers , 2007, FAST.

[12]  Randal E. Bryant,et al.  Data-Intensive Supercomputing: The case for DISC , 2007 .

[13]  Yuan Yu,et al.  Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.

[14]  Injong Rhee,et al.  CUBIC: a new TCP-friendly high-speed TCP variant , 2008, OPSR.

[15]  Srinivasan Seshan,et al.  Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems , 2008, FAST.

[16]  Amin Vahdat,et al.  A scalable, commodity data center network architecture , 2008, SIGCOMM '08.

[17]  Randy H. Katz,et al.  Improving MapReduce Performance in Heterogeneous Environments , 2008, OSDI.

[18]  Guanying Wang,et al.  A simulation approach to evaluating design decisions in MapReduce setups , 2009, 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems.

[19]  Amar Phanishayee,et al.  Safe and effective fine-grained TCP retransmissions for datacenter communication , 2009, SIGCOMM '09.

[20]  Jasleen Kaur,et al.  RAPID: Shrinking the Congestion-Control Timescale , 2009, IEEE INFOCOM 2009.

[21]  Eric Anderson,et al.  DataSeries: an efficient, flexible data format for structured serial data , 2009, OPSR.

[22]  Peter Sanders,et al.  Scalable distributed-memory external sorting , 2009, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[23]  Albert G. Greenberg,et al.  Reining in the Outliers in Map-Reduce Clusters using Mantri , 2010, OSDI.

[24]  J. R. Santos,et al.  Ext 4 block and inode allocator improvements , 2010 .

[25]  Michael Stonebraker,et al.  MapReduce and parallel DBMSs: friends or foes? , 2010, CACM.

[26]  Amin Vahdat,et al.  TritonSort: A Balanced Large-Scale Sorting System , 2011, NSDI.

[27]  Gregory R. Ganger,et al.  Disks Are Like Snowflakes: No Two Are Alike , 2011, HotOS.