Turbine: a distributed-memory dataflow engine for extreme-scale many-task applications

Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely-coupled, coarse-grained tasks, each comprising a tightly-coupled parallel function or program. "Many-task" programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly-coupled parallelism at the lower level via multithreading, message passing, and/or partitioned global address spaces. At large scales, however, managing task distribution, data dependencies, and inter-task data movement becomes a significant performance challenge. In this work, we describe Turbine, a new, highly scalable, distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
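
To make the two-level model concrete, the following is a minimal Python sketch of the many-task pattern the abstract describes: an upper layer generates many coarse-grained tasks, each standing in for a tightly-coupled parallel computation, and a downstream task runs when its data dependencies are satisfied. This is an illustrative sketch only, not the Turbine or Swift API; the names simulate, analyze, and the sweep size are hypothetical stand-ins.

    # Minimal sketch of the many-task dataflow pattern (not Turbine's API).
    from concurrent.futures import ProcessPoolExecutor, as_completed

    def simulate(parameter):
        # Stand-in for launching one tightly-coupled parallel computation
        # (e.g., an MPI or multithreaded solver) on a single parameter point.
        return parameter * parameter

    def analyze(results):
        # Downstream task that consumes the upstream outputs, expressing a
        # data dependency rather than explicit synchronization.
        return sum(results) / len(results)

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            # Upper layer: generate many independent coarse-grained tasks.
            futures = [pool.submit(simulate, p) for p in range(1000)]
            # Dataflow dependency: collect results as each task completes,
            # then run the dependent analysis step.
            outputs = [f.result() for f in as_completed(futures)]
            print(analyze(outputs))

At extreme scale, the point of an engine like Turbine is that the task generation, dependency tracking, and data movement implied by such a script cannot be handled by a single coordinating process and must themselves be distributed.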
