Parallel MPSoC Simulation and Architecture Evaluation

To exploit the parallelism of multi-core host machines, this chapter introduces four novel parallel discrete-event simulation techniques that map the parallelism of the simulated target architectures and applications onto parallel threads on the host machine. To guarantee timing results equal to those of sequential simulation, the parallel host threads must be synchronized and activated correctly; each of the four proposed parallelization techniques realizes this differently. Furthermore, parallel simulation makes it possible to evaluate different architectural design choices, such as the number of tiles, the internal tile structure, and the selection of cores within each tile. Case studies on the performance and cost trade-offs of different heterogeneous invasive architecture variants are presented. Combined, the simulation techniques form a holistic simulation approach for modern multi- and many-core architectures that is fast and sufficiently accurate in timing to simulate parallel invasive applications, to gain valuable insight into their dynamic behavior, and to evaluate different architecture alternatives. The reader will learn the presented concepts for modeling and accelerating the simulation of different hardware components at the architecture level and how to combine them into a unified full-system simulation.
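The requirement that parallel simulation reproduce the timing of sequential simulation can be illustrated with a generic conservative, window-based synchronization scheme. The sketch below is not one of the chapter's four techniques; it is a minimal illustration under stated assumptions. The names `handle`, `LATENCY`, `simulate_sequential`, and `simulate_parallel` are hypothetical, simulated cores interact only through messages, and every message is assumed to take at least `LATENCY` time units to arrive. Under that assumption, all events inside a window of length `LATENCY` can be processed by independent host threads without affecting one another.

```python
import heapq
import threading

LATENCY = 10  # assumed minimum inter-core message latency (the lookahead)


def handle(core, time, hops, n_cores):
    """Hypothetical event handler: forward a token to the next core."""
    if hops == 0:
        return []
    # The latency is never below LATENCY, which makes the window below safe.
    return [((core + 1) % n_cores, time + LATENCY + core, hops - 1)]


def simulate_sequential(n_cores, initial):
    """Reference: one global event queue with (time, core, hops) events."""
    queue = list(initial)
    heapq.heapify(queue)
    trace = []
    while queue:
        time, core, hops = heapq.heappop(queue)
        trace.append((time, core, hops))
        for dest, t, h in handle(core, time, hops, n_cores):
            heapq.heappush(queue, (t, dest, h))
    return trace


def simulate_parallel(n_cores, initial):
    """One host thread per simulated core, synchronized per time window."""
    queues = [[] for _ in range(n_cores)]
    for time, core, hops in initial:
        heapq.heappush(queues[core], (time, hops))
    inboxes = [[] for _ in range(n_cores)]
    locks = [threading.Lock() for _ in range(n_cores)]
    traces = [[] for _ in range(n_cores)]
    state = {"now": None}

    def advance():
        # Runs in exactly one thread while all others wait at the barrier.
        heads = [q[0][0] for q in queues if q]
        state["now"] = min(heads) if heads else None

    advance()
    sent = threading.Barrier(n_cores)
    merged = threading.Barrier(n_cores, action=advance)

    def worker(core):
        queue = queues[core]
        while state["now"] is not None:
            horizon = state["now"] + LATENCY
            # Events before the horizon cannot be affected by other cores.
            while queue and queue[0][0] < horizon:
                time, hops = heapq.heappop(queue)
                traces[core].append((time, core, hops))
                for dest, t, h in handle(core, time, hops, n_cores):
                    with locks[dest]:
                        inboxes[dest].append((t, h))
            sent.wait()  # every message of this window has been posted
            for event in inboxes[core]:
                heapq.heappush(queue, event)
            inboxes[core].clear()
            merged.wait()  # the barrier action picks the next window start

    threads = [threading.Thread(target=worker, args=(c,)) for c in range(n_cores)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(event for trace in traces for event in trace)
```

Calling both functions with the same initial events, for example `[(0, 0, 5), (3, 2, 4), (3, 1, 0)]` on four cores, should yield identical timestamped traces. Because of CPython's global interpreter lock, this sketch demonstrates only the synchronization logic, not an actual speedup on the host.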
