Exploiting thread and data level parallelism for ultimate parallel SystemC simulation

Most parallel SystemC approaches have two limitations: (a) the user must manually separate all parallel threads to avoid data corruption due to race conditions, and (b) available hardware vector units are not utilized. In this paper, we present an advanced compiler infrastructure that automatically parallelizes SystemC models at the thread level and additionally exploits opportunities for data-level parallelization. Our experimental results show a nearly linear speedup of N×M, where N and M denote the thread-level and data-level parallelization factors, respectively. In particular, a 4-core multi-processor achieves a speedup of up to 8.8×, and a 60-core Xeon Phi processor reaches up to 212×.
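To illustrate how the two levels of parallelism can combine, the following is a minimal C++ sketch, not the compiler output or the infrastructure described above. It assumes a simple element-wise kernel; the function names `scale_add_chunk` and `scale_add_parallel` are hypothetical and chosen only for this example. The outer level splits the data into disjoint chunks, one per thread (factor N), and the inner loop is written so that a compiler can typically map it onto vector instructions (factor M).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Data-level parallelism: a branch-free loop over contiguous data that an
// optimizing compiler can typically map onto vector (SIMD) instructions.
void scale_add_chunk(const float* a, const float* b, float* out,
                     std::size_t count, float k) {
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = k * a[i] + b[i];
    }
}

// Thread-level parallelism: split the index range into disjoint chunks,
// one per worker thread, so that no two threads write the same element
// and no race condition can occur.
std::vector<float> scale_add_parallel(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      float k, unsigned num_threads) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> out(n);
    if (num_threads == 0) num_threads = 1;

    // Ceiling division so that every element belongs to exactly one chunk.
    const std::size_t chunk = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no work left for further threads
        workers.emplace_back(scale_add_chunk,
                             a.data() + begin, b.data() + begin,
                             out.data() + begin, end - begin, k);
    }
    for (auto& w : workers) w.join();
    return out;
}
```

Calling `scale_add_parallel(a, b, 2.0f, 4)` would run up to four worker threads, each executing a vectorizable loop over its own chunk. Whether the inner loop is actually vectorized depends on the compiler and its optimization flags, so any measured speedup from this sketch is not comparable to the figures reported above.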
