Netrace: dependency-driven trace-based network-on-chip simulation

Chip multiprocessors (CMPs) and systems-on-chip (SOCs) are expected to grow in core count from a few today to hundreds or more. Because efficient on-chip communication is a primary factor in the performance of large core-count systems, the research community has directed substantial attention to networks-on-chip (NOCs). Current NOC evaluation methodologies include analytical modeling, network simulation, and full-system simulation. However, as core count and system complexity grow, the deficiencies of each of these methods will limit their ability to meet the demands of developers and researchers. Developing efficient NOCs requires evaluation techniques and metrics that combine high fidelity with low overhead. To address these challenges, this paper describes a new trace-based network simulation methodology that captures the dependencies between network messages observed in full-system simulation of multithreaded applications. We also introduce Netrace, a library of tools and traces that enables targeted NOC simulators to track and replay network messages and their dependencies.
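To make the idea of dependency-driven replay concrete, the following is a minimal sketch in Python. It is not Netrace's API or trace format: the `Packet` fields, the `replay` function, and the `latency` callback standing in for a network model are all hypothetical names chosen for illustration. The sketch assumes that a packet may be injected only once its recorded trace cycle has been reached and every packet it depends on has been delivered.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    # Hypothetical record layout, not the Netrace trace format.
    pid: int                                   # unique packet id
    cycle: int                                 # earliest injection cycle from the trace
    src: int                                   # source node
    dst: int                                   # destination node
    deps: list = field(default_factory=list)   # ids that must be delivered first


def replay(packets, latency):
    """Replay a trace and return {pid: (inject_cycle, deliver_cycle)}.

    `latency(pkt)` is an assumed stand-in for the network model under test.
    """
    done = {}
    pending = sorted(packets, key=lambda p: p.cycle)
    while pending:
        progressed = False
        remaining = []
        for p in pending:
            if all(d in done for d in p.deps):
                # Inject no earlier than the trace cycle and no earlier
                # than the delivery of every dependency.
                ready = max([p.cycle] + [done[d][1] for d in p.deps])
                done[p.pid] = (ready, ready + latency(p))
                progressed = True
            else:
                remaining.append(p)
        if not progressed:
            raise ValueError("cyclic or missing dependency in trace")
        pending = remaining
    return done


trace = [
    Packet(pid=0, cycle=0, src=0, dst=1),
    Packet(pid=1, cycle=5, src=1, dst=0, deps=[0]),  # e.g. a reply to packet 0
    Packet(pid=2, cycle=3, src=2, dst=3),
]
slow = replay(trace, lambda p: 10)  # assumed slow network: 10 cycles per packet
fast = replay(trace, lambda p: 2)   # assumed fast network: 2 cycles per packet
```

With the slow model, packet 1 cannot be injected until packet 0 arrives at cycle 10; with the fast model it is injected at its trace cycle of 5. A replay that used only the fixed trace timestamps would inject packet 1 at cycle 5 in both cases, which is the inaccuracy that tracking dependencies is meant to avoid.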
