A 32kB 2R/1W L1 data cache in 45nm SOI technology for the POWER7™ processor

Increasing demand for parallelism due to out-of-order and multi-threaded computation requires fast and dense arrays with multi-port capabilities. The load-store unit (LSU) of the POWER7™ microprocessor core has a 32kB L1 data cache composed of four 8kB blocks. In two-cycle back-to-back operation, it concurrently supports two independent reads and one write. The array is organized in banks of 16 cells each, and the two reads operate independently in any of these banks, including two reads within the same bank or even the same cell. A bank selected for write is blocked for any read operation. If a read and a write collide within the same bank, collision-control circuitry gives the write priority over the read. Each read port provides 4B from 1 of 256 locations, whereas the double-bandwidth write operation provides individual control of 8B to 128 locations.
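The port-arbitration rule described above can be illustrated with a small behavioral sketch. This is only a model of the stated priority rule, not the circuit implementation: the function name `arbitrate`, the use of integer bank indices, and the representation of an idle port as `None` are assumptions made for illustration. What happens to a blocked read (for example, whether it is retried) is not described in the text and is not modeled.

```python
from typing import Optional, Sequence, Tuple


def arbitrate(read_banks: Sequence[Optional[int]],
              write_bank: Optional[int] = None) -> Tuple[bool, ...]:
    """Return, per read port, whether the read proceeds in this access.

    read_banks: bank index requested by each read port (None = port idle).
    write_bank: bank index selected for write (None = no write).

    Rule modeled (from the text): a bank selected for write is blocked
    for reads; reads never block each other, even in the same bank.
    """
    if len(read_banks) > 2:
        raise ValueError("the cache described has only two read ports")
    return tuple(
        bank is not None and bank != write_bank  # write wins on collision
        for bank in read_banks
    )


# Two reads to the same bank, no write: both proceed.
same_bank = arbitrate([3, 3])

# Read port 0 collides with the write in bank 3: only port 1 proceeds.
collision = arbitrate([3, 5], write_bank=3)
```

In this sketch the write is always granted, which reflects the write-over-read priority stated in the text.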
