Parallel application sampling for accelerating MPSoC simulation

Multi-processor system-on-chip (MPSoC) simulators are many orders of magnitude slower than the hardware they simulate, largely because of increasing architectural complexity. In this paper, we propose a new application sampling technique to accelerate MPSoC simulation for design space exploration (DSE). The proposed technique dynamically combines the phases that execute simultaneously into a single sampling unit. Simulation is accelerated because repeated combinations of parallel phases can be skipped. A complementary technique, called cluster synthesis, is also proposed to improve the acceleration when the number of possible phase combinations increases. Our experimental results show that this technique can accelerate simulation by up to a factor of 800 with a relatively small estimation error.
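The idea of skipping repeated combinations of parallel phases can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in the paper, whose details the abstract does not give. It assumes that each processor's execution has already been split into fixed-length intervals labeled with phase identifiers, and that a hypothetical callback `detailed_sim` returns the cycle count of one interval for a given phase combination. The names `estimate_cycles`, `phase_traces`, and `detailed_sim` are illustrative only, and cluster synthesis is not modeled.

```python
from typing import Callable, Dict, Sequence, Tuple

# One phase ID per processor for a given interval (hypothetical representation).
PhaseCombo = Tuple[int, ...]


def estimate_cycles(
    phase_traces: Sequence[Sequence[int]],
    detailed_sim: Callable[[PhaseCombo], float],
) -> Tuple[float, int]:
    """Estimate total cycles, simulating each phase combination only once.

    phase_traces[p][i] is the phase ID of processor p during interval i
    (assumed to be known in advance for this sketch).
    Returns (estimated total cycles, number of detailed simulations run).
    """
    if len({len(trace) for trace in phase_traces}) > 1:
        raise ValueError("all processors must have the same number of intervals")

    measured: Dict[PhaseCombo, float] = {}  # cycles recorded per combination
    total = 0.0
    detailed_runs = 0

    # zip(*...) yields, per interval, the tuple of phases running in parallel.
    for combo in zip(*phase_traces):
        if combo not in measured:
            # First occurrence: pay for a detailed simulation and record it.
            measured[combo] = detailed_sim(combo)
            detailed_runs += 1
        # Repeated occurrence: reuse the recorded value instead of simulating.
        total += measured[combo]

    return total, detailed_runs


# Usage: two processors, six intervals, only four distinct combinations.
traces = [[0, 0, 1, 1, 0, 0], [2, 3, 2, 3, 2, 3]]
total, runs = estimate_cycles(traces, lambda c: 100.0 * (c[0] + 1) + c[1])
```

In this toy setting the speedup comes from the ratio of intervals to distinct combinations. As the number of processors grows, fewer combinations repeat, which is the situation the complementary cluster synthesis technique is meant to address.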
