Mapping streaming applications on multiprocessors with time-division-multiplexed network-on-chip

This paper addresses the mapping of streaming applications (such as MPEG) onto multiprocessor platforms with a time-division-multiplexed (TDM) network-on-chip. In particular, we solve the processor selection, path selection, and router configuration problems. Given the complexity of these problems, state-of-the-art approaches in this area largely rely on greedy heuristics, which do not guarantee optimality. Our approach is based on a constraint programming formulation that merges a number of steps that classic approaches usually tackle in sequence. Our method therefore has the potential to find optimal solutions with respect to resource usage under throughput constraints. The experimental evaluation presented here shows that our approach can explore a range of solutions while letting the designer emphasize the importance of various design metrics.
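To make the router configuration problem concrete, the following is a minimal sketch of slot allocation on a TDM network-on-chip. It is not the constraint programming formulation of the paper. It assumes a common contention-free model in which a flit advances exactly one hop per time slot, so a connection injected in slot s occupies slot (s + h) mod T on the h-th link of its path. The router names, the table size, and the helper functions `slots_along_path` and `find_start_slot` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Set, Tuple

Link = Tuple[str, str]          # directed link between two routers (hypothetical names)
Reservation = Tuple[Link, int]  # a link together with the slot index reserved on it


def slots_along_path(path: List[Link], start_slot: int, table_size: int) -> List[Reservation]:
    """Return the (link, slot) pairs occupied by a flit injected at start_slot.

    Assumption: a flit advances one hop per time slot, so the slot index
    shifts by one (modulo the slot table size) on each successive link.
    """
    return [(link, (start_slot + hop) % table_size) for hop, link in enumerate(path)]


def find_start_slot(path: List[Link], table_size: int,
                    occupied: Set[Reservation]) -> Optional[int]:
    """Exhaustively search for an injection slot that hits no existing reservation."""
    for start in range(table_size):
        needed = slots_along_path(path, start, table_size)
        if all(pair not in occupied for pair in needed):
            return start
    return None  # no conflict-free slot exists on this path


# Two connections that share the link R1 -> R3, with a slot table of size 2.
table_size = 2
occupied: Set[Reservation] = set()
path_a = [("R0", "R1"), ("R1", "R3")]
path_b = [("R2", "R1"), ("R1", "R3")]

slot_a = find_start_slot(path_a, table_size, occupied)
occupied.update(slots_along_path(path_a, slot_a, table_size))

slot_b = find_start_slot(path_b, table_size, occupied)
occupied.update(slots_along_path(path_b, slot_b, table_size))

# A third connection on the same route as path_a no longer fits.
slot_c = find_start_slot(path_a, table_size, occupied)
```

In this sketch the paths are fixed before slots are assigned, which is the sequential style the abstract contrasts with. When `find_start_slot` returns `None`, a sequential flow has to backtrack by hand to a different path or processor choice, whereas a merged formulation lets the solver reconsider those choices within the same search.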
