I-cache multi-banking and vertical interleaving

This research investigates the impact of a microarchitectural technique called vertical interleaving in multi-banked caches. Unlike earlier multi-banking and interleaving techniques, which aim to increase cache bandwidth, vertical interleaving further divides the memory banks of a cache into vertically arranged sub-banks that are selectively accessed based on the memory address. In this setting, we are particularly interested in how accesses to the instruction cache are distributed across the cache banks. We quantitatively analyze the memory access pattern seen by each cache bank and establish the relationship between key cache parameters and those access patterns. Our study shows that vertical interleaving distributes accesses among the banks with tightly bounded run lengths. We then discuss possible applications of the concept, including power density reduction. Even very simple interleaving configurations can reduce maximum power density by as much as 67% under a realistic machine configuration. Our study suggests that vertically interleaving cache lines has potential for optimizing memory accesses in a number of ways.
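
The notion of address-based sub-bank selection and bounded run lengths can be illustrated with a minimal sketch. The mapping below is an assumption made purely for illustration, not the mapping used in the paper: the line size, the bank and sub-bank counts, and the helper names `sub_bank_of` and `run_lengths` are all hypothetical. Consecutive cache lines alternate between horizontal banks and then step through the vertical sub-banks, and a "run" is a maximal sequence of consecutive fetches that hit the same sub-bank.

```python
from itertools import groupby

# All parameters are assumed for illustration; the abstract does not give them.
LINE_SIZE = 32      # bytes per cache line
NUM_BANKS = 2       # horizontal banks
NUM_SUB_BANKS = 4   # vertically arranged sub-banks per bank


def sub_bank_of(addr):
    """Map a byte address to (bank, sub_bank) under a hypothetical mapping."""
    line = addr // LINE_SIZE
    bank = line % NUM_BANKS                          # low line bits pick the bank
    sub_bank = (line // NUM_BANKS) % NUM_SUB_BANKS   # next bits pick the sub-bank
    return bank, sub_bank


def run_lengths(addresses):
    """Lengths of maximal runs of consecutive accesses to the same sub-bank."""
    targets = (sub_bank_of(a) for a in addresses)
    return [sum(1 for _ in group) for _, group in groupby(targets)]


# Straight-line fetch of 4-byte instructions over 1 KiB of code.
trace = list(range(0, 1024, 4))
runs = run_lengths(trace)
print(max(runs))  # -> 8: a run ends whenever the fetch crosses a line boundary
```

Under this assumed mapping, a sequential fetch stream never stays in one sub-bank for longer than one cache line, which is one simple way run lengths could be bounded. Branch-heavy traces would need a real instruction trace to evaluate.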
