Exploration of 3D stacked L2 cache design for high performance and efficient thermal control

Three-dimensional (3D) integration enables stacking large memories on top of chip multiprocessors (CMPs). Compared with the 2D case, the extra dimension and the high bandwidth between layers provide more options for designing on-chip memory such as L2 caches. In this work, we study the design of 3D stacked set-associative L2 caches by managing the placement of cache ways. Our evaluation shows that way placement affects performance. In addition, we propose a shadow-tag technique that dynamically adjusts the working size of the 3D cache to save power and reduce the peak temperature. The results show that the proposed inter-layer, core-based-distribution placement of 3D cache ways is the best design option when both performance and thermal management are considered.
