ParTejas

In this paper, we present the design of a novel multicore simulator called ParTejas. It is a fast shared memory based parallel simulator written in Java. Unlike recently released parallel simulators that mainly rely on sampling, high level models, and highly relaxed synchronization, we primarily rely on novel concurrent data structures. In specific, we use a lock free parallel slot scheduler for synchronizing the accesses of multiple threads at a shared resource, and we use flexible barriers known as phasers to relax synchronization within bounds. We leverage additional language specific features of Java, and demonstrate a mean speedup of 11.8X (simulation speed of 4-8 MIPS) with 64 threads for a suite of Splash2 and Parsec benchmarks.

[1]  Gary L. Peterson,et al.  Myths About the Mutual Exclusion Problem , 1981, Inf. Process. Lett..

[2]  David R. Kaeli,et al.  Multi2Sim: A simulation framework for CPU-GPU computing , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[3]  Prathmesh Kallurkar,et al.  Tejas: A java based versatile micro-architectural simulator , 2015, 2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS).

[4]  Thomas F. Wenisch,et al.  SimFlex: Statistical Sampling of Computer System Simulation , 2006, IEEE Micro.

[5]  Jianwei Chen,et al.  SlackSim: a platform for parallel simulations of CMPs on CMPs , 2009, CARN.

[6]  Smruti R. Sarangi,et al.  ParTejas: A parallel simulator for multicore processors , 2014, ISPASS.

[7]  Paolo Faraboschi,et al.  COTSon: infrastructure for full system simulation , 2009, OPSR.

[8]  George Kurian,et al.  Graphite: A distributed parallel simulator for multicores , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.

[9]  Christoforos E. Kozyrakis,et al.  ZSim: fast and accurate microarchitectural simulation of thousand-core systems , 2013, ISCA.

[10]  Fredrik Larsson,et al.  Simics: A Full System Simulation Platform , 2002, Computer.

[11]  Anoop Gupta,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.

[12]  Ronald G. Dreslinski,et al.  The M5 Simulator: Modeling Networked Systems , 2006, IEEE Micro.

[13]  Kai Li,et al.  The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[14]  Matt T. Yourst PTLsim: A Cycle Accurate Full System x86-64 Microarchitectural Simulator , 2007, 2007 IEEE International Symposium on Performance Analysis of Systems & Software.

[15]  Harish Patil,et al.  Pin: building customized program analysis tools with dynamic instrumentation , 2005, PLDI '05.

[16]  J. Mark Bull,et al.  Benchmarking Java against C and Fortran for scientific applications , 2001, JGI '01.

[17]  Stijn Eyerman,et al.  Interval simulation: Raising the level of abstraction in architectural simulation , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.

[18]  Lieven Eeckhout,et al.  Sniper: Exploring the level of abstraction for scalable and accurate parallel multi-core simulation , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[19]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, ISCA '90.

[20]  Tudor David,et al.  Everything you always wanted to know about synchronization but were afraid to ask , 2013, SOSP.

[21]  Todd M. Austin,et al.  SimpleScalar: An Infrastructure for Computer System Modeling , 2002, Computer.

[22]  Laxmikant V. Kalé,et al.  BigSim: a parallel simulator for performance prediction of extremely large parallel machines , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..

[23]  Shobhit Kanaujia,et al.  FastMP: A Multi-core Simulation Methodology , 2006 .

[24]  Smruti R. Sarangi,et al.  Lock-Free and Wait-Free Slot Scheduling Algorithms , 2016, IEEE Transactions on Parallel and Distributed Systems.

[25]  Anoop Gupta,et al.  Complete computer system simulation: the SimOS approach , 1995, IEEE Parallel Distributed Technol. Syst. Appl..

[26]  Christopher J. Hughes,et al.  RSIM: Simulating Shared-Memory Multiprocessors with ILP Processors , 2002, Computer.

[27]  Jose Renau,et al.  ESESC: A fast multicore simulator using Time-Based Sampling , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[28]  Milo M. K. Martin,et al.  Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.

[29]  James R. Larus,et al.  Wisconsin Wind Tunnel II: a fast, portable parallel architecture simulator , 2000, IEEE Concurr..

[30]  James R. Larus,et al.  The Wisconsin Wind Tunnel: virtual prototyping of parallel computers , 1993, SIGMETRICS '93.