Off-chip access localization for NoC-based multicores

In a network-on-chip (NoC) based multicore, an off-chip data access must travel through the on-chip network, spending a considerable amount of time within the chip in addition to the memory access itself. It also delays on-chip accesses by creating contention for network resources. In this paper, we propose a compiler-guided off-chip data access localization strategy to ensure that an off-chip access traverses only a small number of links (hops) to reach the memory controller that governs the memory bank holding the requested data. The results collected clearly emphasize the importance of localizing off-chip accesses.
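
To make the hop-count objective concrete, the following minimal sketch compares the average number of links an off-chip request traverses when data is spread evenly across all memory controllers versus when each core's data is served by its nearest controller. It is not the compiler strategy proposed in the paper. The 4x4 mesh, the corner placement of the controllers, minimal XY routing, and the names `hops`, `nearest_controller`, and `average_hops` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a node in the mesh


def hops(src: Coord, dst: Coord) -> int:
    """Links traversed between two nodes, assuming minimal XY routing
    on a 2D mesh (i.e. the Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def nearest_controller(core: Coord, controllers: Dict[str, Coord]) -> str:
    """Name of the controller closest to `core`; ties broken by name."""
    return min(controllers, key=lambda name: (hops(core, controllers[name]), name))


def average_hops(accesses: List[Tuple[Coord, str]],
                 controllers: Dict[str, Coord]) -> float:
    """Mean hop count over a list of (requesting core, target controller) pairs."""
    total = sum(hops(core, controllers[mc]) for core, mc in accesses)
    return total / len(accesses)


# Hypothetical 4x4 mesh with one memory controller attached at each corner.
controllers = {"MC0": (0, 0), "MC1": (3, 0), "MC2": (0, 3), "MC3": (3, 3)}
cores = [(x, y) for x in range(4) for y in range(4)]

# Baseline: every core sends one request to every controller, as would
# happen if its data were interleaved evenly across all memory banks.
baseline = [(core, mc) for core in cores for mc in controllers]

# Localized: the same number of requests per core, all directed to the
# nearest controller, as if the data had been placed behind that controller.
localized = [(core, nearest_controller(core, controllers))
             for core in cores for _ in controllers]

print("baseline average hops: ", average_hops(baseline, controllers))
print("localized average hops:", average_hops(localized, controllers))
```

On this toy configuration the average drops from 3.0 hops to 1.0 hop per off-chip request. The numbers describe only the assumed mesh and access pattern, not the paper's evaluation; a real layout transformation must also account for data shared between cores and for load balance across controllers.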
