Dynamic scheduling of stream programs on embedded multi-core processors

Stream computing has emerged as an important model of computation for embedded system applications, particularly in the multimedia and network processing domains. In the recent past, several programming languages and embedded multi-core processors have been proposed for streaming applications. This paper examines the execution and dynamic scheduling of stream programs on embedded multi-core processors. It addresses the problem in the context of a multi-tasking environment in which the number of processing elements allocated to a particular streaming application varies over time. As a solution, the paper proposes a two-step approach. In the first step, the stream program is compiled to gather key application information and to generate re-targetable code. The second step is a lightweight dynamic scheduler that uses this static information, together with the resources currently available, to partition the application across the multi-core architecture. The objective of the dynamic scheduler is to maximize the throughput of the application while respecting the resource constraints (processing elements, scratch-pad memory, DMA bandwidth) imposed by the target architecture. We evaluate the proposed approach by compiling and scheduling benchmark stream programs on a representative embedded multi-core processor, and we present experimental results that compare the quality of the generated solutions with that of existing techniques.
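As a rough illustration of the second step, the sketch below re-partitions a set of stream actors whenever the number of available processing elements changes. It is not the paper's algorithm: it assumes a simple greedy heuristic (heaviest actor first, placed on the least-loaded processing element with enough scratch-pad space), and the names `Actor`, `partition`, `work`, `memory`, and `spm_bytes` are hypothetical. DMA bandwidth and inter-actor communication are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    """Per-actor information assumed to come from the static compilation step."""
    name: str
    work: int    # estimated cycles per steady-state iteration (hypothetical field)
    memory: int  # scratch-pad bytes for code and buffers (hypothetical field)


def partition(actors, num_pes, spm_bytes):
    """Greedily map actors to PEs; return (assignment, bottleneck load).

    The bottleneck load is the largest per-PE work total. Minimizing it
    is used here as a stand-in for maximizing throughput.
    """
    loads = [0] * num_pes  # accumulated work per PE
    used = [0] * num_pes   # accumulated scratch-pad usage per PE
    assignment = {}
    # Place the heaviest actors first so that light ones fill the gaps.
    for actor in sorted(actors, key=lambda a: a.work, reverse=True):
        candidates = [pe for pe in range(num_pes)
                      if used[pe] + actor.memory <= spm_bytes]
        if not candidates:
            raise ValueError(f"no PE has scratch-pad space for {actor.name}")
        pe = min(candidates, key=lambda p: loads[p])  # least-loaded feasible PE
        loads[pe] += actor.work
        used[pe] += actor.memory
        assignment[actor.name] = pe
    return assignment, max(loads)


if __name__ == "__main__":
    graph = [Actor("A", 8, 10), Actor("B", 5, 10),
             Actor("C", 4, 10), Actor("D", 3, 10)]
    # The PE allocation changes over time; re-run the partitioner each time.
    for pes in (1, 2, 4):
        mapping, bottleneck = partition(graph, pes, spm_bytes=100)
        print(pes, mapping, bottleneck)
```

Because the per-actor estimates are computed once at compile time, each re-partitioning only needs a sort and a linear pass, which is the kind of low overhead a run-time scheduler requires.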
