HySIM: Towards a Scalable, Accurate and Fast Simulator for Manycore Processors

Simulation is the primary means of exploring the design space of computer architecture. As the number of cores on a die increases, and the use of heterogeneous cores and on-chip networks leads to increasingly complex systems, fast, cycle-accurate simulation presents a formidable challenge. In this paper we present the rationale for and design of a hybrid simulator called HySIM that takes RAMP Gold and moves the timing model of the memory subsystem to the host. This has four important advantages: first, it makes it possible to use multiple FPGAs in order to scale to a large number of cores, including heterogeneous cores and processors with hardware accelerators; second, it enables the modeling of detailed cache-coherence protocols by interfacing the simulator to a memory model such as Ruby; third, it provides a cycle-accurate model for the on-chip network that is flexible enough to support different topologies, routing schemes, and router micro-architectures; and fourth, it frees up resources on the FPGA to increase the number of physical cores or to incorporate an on-chip L1 cache. We present preliminary results to validate HySIM and describe ongoing effort to improve its performance.
