FORMLESS: scalable utilization of embedded manycores in streaming applications

Variants of dataflow specification models are widely used to synthesize streaming applications for distributed-memory parallel processors. We argue that the current practice of specifying streaming applications using rigid dataflow models implicitly prohibits a number of platform-oriented optimizations and hence limits portability and scalability with respect to the number of processors. We motivate Functionally-cOnsistent stRucturally-MalLEable Streaming Specification, dubbed FORMLESS, which raises the abstraction level beyond fixed-structure dataflow to address its portability and scalability limitations. To demonstrate the potential of the idea, we develop a design space exploration scheme that customizes the application specification to better fit the target platform. Experiments with several common streaming case studies demonstrate improved portability and scalability over conventional dataflow specification models, and confirm the effectiveness of our approach.
