Speculative pipelining for compute cloud programming

MapReduce job execution typically occurs in sequential phases of parallel steps. These phases can experience unpredictable delays when available computing and network capacities fluctuate or when there are large disparities in inter-node communication delays, as can occur on shared compute clouds. We propose a pipeline-based scheduling strategy, called speculative pipelining, which uses speculative prefetching and computing to minimize execution delays in subsequent stages due to varying resource availability. Our proposed method can mask the time required to perform speculative operations by overlapping with other ongoing operations. We introduce the notion of “open-option” prefetching, which, via coding techniques, allows speculative prefetching to begin even before knowing exactly which input will be needed. On a compute cloud testbed, we apply speculative pipelining to the Hadoop sorting benchmark and show that sorting time is shortened significantly.

[1]  S PlankJames,et al.  A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems , 1997 .

[2]  Randy H. Katz,et al.  Failure correction techniques for large disk arrays , 1989, ASPLOS III.

[3]  Randy H. Katz,et al.  Above the Clouds: A Berkeley View of Cloud Computing , 2009 .

[4]  Ravi Kumar,et al.  Pig latin: a not-so-foreign language for data processing , 2008, SIGMOD Conference.

[5]  Amin Vahdat,et al.  A scalable, commodity data center network architecture , 2008, SIGCOMM '08.

[6]  Rob Pike,et al.  Interpreting the data: Parallel analysis with Sawzall , 2005, Sci. Program..

[7]  James S. Plank A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems , 1997 .

[8]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[9]  Albert G. Greenberg,et al.  The cost of a cloud: research problems in data center networks , 2008, CCRV.

[10]  Michael Isard,et al.  DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language , 2008, OSDI.

[11]  Randy H. Katz,et al.  Improving MapReduce Performance in Heterogeneous Environments , 2008, OSDI.

[12]  Jin-Soo Kim,et al.  HPMR: Prefetching and pre-shuffling in shared MapReduce computation environment , 2009, 2009 IEEE International Conference on Cluster Computing and Workshops.

[13]  Joseph M. Hellerstein,et al.  MapReduce Online , 2010, NSDI.

[14]  D Meiron,et al.  Data Analysis Challenges , 2008 .