Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications

Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled, coarse-grained tasks, each comprising a tightly coupled parallel function or program. “Many-task” programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly coupled parallelism at the lower layer through multithreading, message passing, and/or partitioned global address spaces. At large scales, however, the management of task distribution, data dependencies, and intertask data movement is a significant performance challenge. In this work, we describe Turbine, a new highly scalable and distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
