Multi-granularity sampling for simulating concurrent heterogeneous applications

Detailed, or cycle-accurate/bit-accurate (CABA), simulation is a critical phase in the design flow of embedded systems. However, as system complexity increases, full detailed simulation becomes prohibitively slow compared with the hardware being simulated. In this paper, we present an approach that uses sampling to speed up the design flow of Multiprocessor System-on-Chip (MPSoC) systems. Based on the dynamic behavior of the applications running concurrently, our method dynamically chooses between multiple granularities of the sampling phase. The similarities of the execution phases are first analyzed for all possible granularities, then the transitions between phase overlaps are discretized. To facilitate the detection of repetitions, one phase, with an appropriate granularity, is chosen per process. Unlike in most other proposals, the performance estimate associated with a phase is usually accurate enough that repeated resampling is not needed. Using checkpointing in conjunction with our approach is also simpler, because the amount of disk space needed is significantly reduced. Experimental results show that the simulation of concurrent heterogeneous applications can be accelerated by a factor of up to 60x, while maintaining an average performance estimation error lower than 5%.
