Adaptive Algorithm and Tool Flow for Accelerating SystemC on Many-Core Architectures

This paper presents an adaptive approach for parallel simulation of SystemC RTL models on future many-core architectures such as Intel's Single-chip Cloud Computer (SCC). It is based on a configurable parallel SystemC kernel that preserves the partial order defined by the SystemC delta cycles while avoiding global synchronization as far as possible. The underlying algorithm relies on a classification of the communication relations between parallel processes. The type and topology of these relations determine the type and number of causality conditions that must be fulfilled at runtime. The parallel kernel is complemented by an automated tool flow that detects relevant model-specific properties, performs a fine-grained model partitioning, classifies communication relations, and configures the kernel. Experiments with an MPSoC model show that purely local synchronization can provide significant performance gains over global synchronization. Furthermore, combining local synchronization with fine-grained partitioning provides additional degrees of freedom for optimization.
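The difference between global and local synchronization can be illustrated with a small timing model. The sketch below is not the kernel described in this paper. It is a simplified illustration under stated assumptions: each partition needs a fixed, known amount of time per delta cycle, and the only causality condition is that a partition may start a cycle once it and every partition it reads from have finished the previous one. The names `global_sync_finish`, `local_sync_finish`, `costs`, and `preds` are hypothetical.

```python
def global_sync_finish(costs):
    """Finish time when all partitions meet at a barrier after every delta cycle.

    costs[p][k] is the (assumed) time partition p needs for delta cycle k.
    """
    n_cycles = len(costs[0])
    # Each cycle lasts as long as its slowest partition.
    return sum(max(c[k] for c in costs) for k in range(n_cycles))


def local_sync_finish(costs, preds):
    """Finish time when each partition waits only for the partitions it reads from.

    preds[p] lists the partitions whose outputs of cycle k-1 are read by p in cycle k.
    """
    n = len(costs)
    n_cycles = len(costs[0])
    done = [0] * n  # finish time of the previous cycle, per partition
    for k in range(n_cycles):
        new_done = []
        for p in range(n):
            # Simplified causality condition: own previous cycle and all
            # communication predecessors must be complete.
            start = max([done[p]] + [done[q] for q in preds[p]])
            new_done.append(start + costs[p][k])
        done = new_done
    return max(done)


# Three partitions, two delta cycles; partition 1 reads from partition 0,
# partition 2 is independent.
costs = [[1, 3], [3, 1], [2, 2]]
preds = {0: [], 1: [0], 2: []}

print(global_sync_finish(costs))        # 6
print(local_sync_finish(costs, preds))  # 4
```

In this toy case the barrier forces every partition to wait for the slowest one in each cycle, whereas local synchronization lets fast partitions run ahead wherever the communication topology allows it. A real kernel would need further conditions, for example to prevent a writer from overrunning a reader.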
