Optimizing parallel simulation of multicore systems using domain-specific knowledge

This paper presents two optimization techniques for the basic Null-message algorithm in the context of parallel simulation of multicore computer architectures. Unlike general, application-independent optimization methods, these optimizations are application-specific and exploit system properties of the simulation application. We show in two ways that domain-specific knowledge offers great potential for optimization. First, it allows Null-messages to be sent much less eagerly, which greatly reduces the number of Null-messages. Second, the internal state of the simulation application allows conservative forecasts of future outgoing events. This leads to an enhanced synchronization algorithm, the Forecast Null-message algorithm, which combines the forecasts from both sides of a link to greatly improve the simulation look-ahead. Compared with the basic Null-message algorithm, our optimizations greatly reduce the number of Null-messages and, as a result, significantly increase simulation performance. On a subset of the PARSEC benchmarks, a maximum speedup of about 6 is achieved with 17 logical processes (LPs).
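To make the look-ahead argument concrete, the following Python sketch contrasts the basic Null-message promise with one possible way of combining forecasts from both sides of a link. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the two-LP setup, and the rule `min(sender_forecast, receiver_forecast + lookahead) + lookahead` are all hypothetical, and the rule assumes that each forecast is a lower bound on the time at which that side would send on its own initiative and that the link latency (`lookahead`) is strictly positive.

```python
def basic_bound(sender_clock, lookahead):
    """Basic Null-message promise: nothing arrives before clock + lookahead."""
    return sender_clock + lookahead


def forecast_bound(sender_forecast, receiver_forecast, lookahead):
    """Assumed combined forecast for one direction of a link (hypothetical rule).

    Each forecast is the earliest time that side could send on its own
    initiative. The sender might also react to a message from the receiver,
    which cannot reach it before receiver_forecast + lookahead.
    """
    earliest_send = min(sender_forecast, receiver_forecast + lookahead)
    return earliest_send + lookahead


def count_null_messages(idle_until, lookahead, use_forecast):
    """Count Null-messages two idle LPs exchange before both may reach idle_until."""
    if lookahead <= 0:
        raise ValueError("lookahead must be positive")
    clock = [0, 0]
    bound = [0, 0]  # bound[i]: earliest possible arrival at LP i promised so far
    sent = 0
    while min(bound) < idle_until:
        for i in (0, 1):
            peer = 1 - i
            clock[i] = min(bound[i], idle_until)  # advance as far as is safe
            if use_forecast:
                promise = forecast_bound(idle_until, idle_until, lookahead)
            else:
                promise = basic_bound(clock[i], lookahead)
            if promise > bound[peer]:  # only send when the promise improves
                bound[peer] = promise
                sent += 1
    return sent


if __name__ == "__main__":
    print("basic:   ", count_null_messages(100, 1, use_forecast=False))
    print("forecast:", count_null_messages(100, 1, use_forecast=True))
```

In this toy setting the basic scheme advances the channel bound by only one look-ahead per message, so the message count grows with the length of the idle period, whereas the forecast variant needs a single exchange. Real multicore models have many links and non-trivial forecasts, so the sketch shows only the direction of the effect.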
