A 65nm level-1 cache for mobile applications

We describe an L1 cache designed for QUALCOMM®'s latest-generation digital signal processor (DSP) core. The cache is 32KB with variable associativity (4 to 16 ways) and is pseudo-dual-ported. Dual access is achieved by banking the cache in a way that keeps the bank-conflict rate below 1%. The cache operates at 600 MHz under worst-case PVT conditions and dissipates 100.8 pJ per access at 1.2V. It is fabricated in a low-leakage multi-threshold-voltage (MTV) 65nm foundry process. The cache supports simultaneous dual double-word accesses, as well as four-double-word evict and fill operations. The memory system comprises a tag array and a data array; both are built from a QUALCOMM®-defined single-ported 6T SRAM cell with an area of 0.54 µm² and a leakage of less than 10 pA per cell. Three threshold voltages are used together with foot and head switches to trade off leakage, active power, and performance. The tag and data arrays use novel circuit techniques, based on data bypassing, to achieve high test coverage with minimal impact on speed. They also employ self-timed circuits with process-dependent sense-amp tracking for high speed and low power.
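
The headline figures above imply a few derived quantities that are easy to check by hand. The sketch below is only a back-of-the-envelope calculation, not the authors' methodology. It assumes that the quoted energy applies to each individual access, that a cache line holds four 8-byte double words (32 bytes, inferred from the four-double-word fill but not stated in the abstract), and that the data array uses one 6T cell per stored bit. The function names are hypothetical.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the abstract.
CACHE_BYTES = 32 * 1024          # 32KB cache capacity
ENERGY_PER_ACCESS_J = 100.8e-12  # 100.8 pJ per access at 1.2V
CLOCK_HZ = 600e6                 # 600 MHz under worst-case PVT conditions
CELL_LEAKAGE_A = 10e-12          # upper bound: less than 10 pA per cell
LINE_BYTES = 32                  # ASSUMPTION: four 8-byte double words per line


def dynamic_power_w(accesses_per_cycle):
    """Access power in watts if every cycle issues this many accesses."""
    return ENERGY_PER_ACCESS_J * CLOCK_HZ * accesses_per_cycle


def data_array_leakage_bound_a():
    """Upper bound on data-array cell leakage in amperes (one cell per bit).

    Ignores the tag array, peripheral circuits, and any savings from the
    foot and head switches.
    """
    return CACHE_BYTES * 8 * CELL_LEAKAGE_A


def num_sets(ways, line_bytes=LINE_BYTES):
    """Number of sets for a given associativity and the assumed line size."""
    return CACHE_BYTES // (ways * line_bytes)


if __name__ == "__main__":
    for n in (1, 2):
        print(f"{n} access(es)/cycle: {dynamic_power_w(n) * 1e3:.2f} mW")
    print(f"data-array leakage bound: {data_array_leakage_bound_a() * 1e6:.2f} uA")
    for ways in (4, 16):
        print(f"{ways} ways: {num_sets(ways)} sets")
```

Under these assumptions, a sustained single access per cycle corresponds to roughly 60 mW of access power, and the dual-access case to about twice that. Real workloads do not access the cache every cycle, so these values are upper-end estimates rather than measured results.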
