Stochastic Contention Level Simulation for Single-Chip Heterogeneous Multiprocessors

Single-chip systems, featuring multiple heterogeneous processors and a variety of communication and memory architectures, have emerged to satisfy the demand for networking, handheld computing, and other custom devices. When simulated at the cycle-accurate level, these system models are slow to build and execute, severely limiting the number of design iterations that can be considered. A key challenge in raising the simulation level above the clock cycle is finding an effective method for estimating contention for shared resources such as memories and buses. This paper introduces a new level of design called the Stochastic Contention Level (SCL). Instead of considering shared resource accesses at clock cycle granularity, SCL simulations operate on blocks that are thousands to millions of clock cycles long, stochastically capturing contention for shared resources via sampled access attributes while still retaining an event-based simulation framework. The SCL approach results in speedups of 40× over cycle-accurate simulation, with average simulation errors of less than one percent and 95 percent confidence intervals of about ±3 percent, providing a unique combination of simulation capabilities, performance, and accuracy. This significant increase in simulation performance enables system designers to explore more of the design space than is possible with traditional simulation approaches.
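
As a rough illustration of the block-level idea described above, the following Python sketch estimates contention delay once per block from a small sample of accesses instead of simulating every access cycle by cycle. It is not the SCL model from the paper: the function names, the uniform wait-time assumption, the `other_utilization` parameter, and the sample size are all hypothetical choices made only for illustration.

```python
import random


def estimate_block_delay(n_accesses, other_utilization, service_cycles,
                         rng, n_samples=200):
    """Estimate extra cycles one block spends waiting for a shared resource.

    Assumption (illustrative only): each access finds the resource busy with
    probability `other_utilization`, and then waits a uniformly distributed
    time between 0 and `service_cycles`.
    """
    if n_accesses == 0:
        return 0.0
    # Sample a bounded number of accesses instead of visiting all of them.
    k = min(n_samples, n_accesses)
    sampled_wait = 0.0
    for _ in range(k):
        if rng.random() < other_utilization:
            sampled_wait += rng.uniform(0.0, service_cycles)
    # Scale the sampled mean wait up to the full access count of the block.
    return (sampled_wait / k) * n_accesses


def simulate(blocks, other_utilization, service_cycles, seed=0):
    """Sum base cycles plus estimated contention delay over all blocks.

    `blocks` is a list of (base_cycles, n_accesses) tuples (hypothetical
    format); each tuple stands for a block of many clock cycles.
    """
    rng = random.Random(seed)
    total = 0.0
    for base_cycles, n_accesses in blocks:
        total += base_cycles
        total += estimate_block_delay(n_accesses, other_utilization,
                                      service_cycles, rng)
    return total


if __name__ == "__main__":
    workload = [(100_000, 5_000), (250_000, 12_000), (80_000, 0)]
    print(simulate(workload, other_utilization=0.3, service_cycles=4))
```

The cost of each block is bounded by `n_samples` regardless of how many accesses the block contains, which is the kind of trade-off between accuracy and speed that a block-level stochastic estimate makes.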
