Optimal Placement of Cores, Caches and Memory Controllers in On-Chip Network

Parallel programming is spreading rapidly and compute-intensive applications need more resources, so demand for on-chip multiprocessors is high. After registers, the L1 caches beside the cores are the fastest storage to access, but private caches cannot grow because of design, cost, and technology limits. Split I-cache and D-cache are therefore used together with a shared last-level cache (LLC). For a unified shared LLC, a bus interface does not scale, so a distributed shared LLC (DSLLC) appears to be the better choice. Most papers assume a DSLLC bank beside every core in the on-chip network; we show that this design ignores the effect of traffic congestion in the network. Our work focuses on the optimal placement of cores, DSLLCs, and memory controllers to minimize the expected latency under a given traffic load in a mesh on-chip network with a fixed number of cores and a fixed total cache capacity. We derive an analytical cost function and then optimize the mean communication delay of the on-chip network. The model is to be verified with traffic patterns run on the CSIM simulator.

Keywords: Network on Chip, Mathematical Modeling, Simulation, Optimization.
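To make the placement problem concrete, the following is a minimal sketch and not the analytical model of this work. It assumes that each mesh tile holds either a core or a DSLLC bank, that traffic from cores to banks is uniform, and that latency is proportional to the XY-routing hop count. It deliberately ignores the congestion effects that the cost function here is meant to capture, and it leaves out memory controllers. The names `mean_hops` and `best_placement` are hypothetical.

```python
from itertools import combinations, product


def mean_hops(cores, caches):
    """Average XY-routing (Manhattan) hop count from every core to every
    cache bank, assuming uniform traffic and no contention."""
    total = sum(abs(cx - bx) + abs(cy - by)
                for (cx, cy) in cores
                for (bx, by) in caches)
    return total / (len(cores) * len(caches))


def best_placement(width, height, n_caches):
    """Exhaustively pick the cache-bank tiles that minimize mean_hops.
    All remaining tiles are assumed to hold cores (an assumption of this
    sketch). Only feasible for very small meshes."""
    tiles = list(product(range(width), range(height)))
    best = None
    for caches in combinations(tiles, n_caches):
        cache_set = set(caches)
        cores = [t for t in tiles if t not in cache_set]
        cost = mean_hops(cores, caches)
        if best is None or cost < best[0]:
            best = (cost, caches)
    return best


if __name__ == "__main__":
    cost, caches = best_placement(3, 3, 1)
    print(cost, caches)
```

Under these assumptions a single bank in a 3x3 mesh ends up on the center tile, since that tile has the smallest total distance to the other eight. A congestion-aware cost function could favor different placements, because the links around a central bank carry more traffic.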
