Low power data-aware STT-RAM based hybrid cache architecture

Static Random Access Memories (SRAMs) occupy a large area of today's microprocessors, and are a prime source of leakage power in highly scaled technologies. Low leakage and high density Spin-Transfer Torque RAMs (STT-RAMs) are ideal candidates for a power-efficient memory. However, STT-RAM suffers from high write energy and latency, especially when writing `one' data. In this paper we propose a novel data-aware hybrid STT-RAM/SRAM cache architecture which stores data in the two partitions based on their bit counts. To exploit the new resultant data distribution in the SRAM partition, we employ an asymmetric low-power 5T-SRAM structure which has high reliability for majority `one' data. The proposed design significantly reduces the number of writes and hence dynamic energy in both STT-RAM and SRAM partitions. We employed a write cache policy and a small swap memory to control data migration between cache partitions. Our evaluation on UltraSPARC-III processor shows that utilizing STT-RAM/6T-SRAM and STT-RAM/5T-SRAM architectures for the L2 cache results in 42% and 53% energy efficiency, 9.3% and 9.1% performance improvement and 16.9% and 20.3% area efficiency respectively, with respect to SRAM-based cache running SPEC CPU 2006 benchmarks.

[1]  Tajana Simunic,et al.  Resistive configurable associative memory for approximate computing , 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[2]  Amin Jadidi,et al.  High-endurance and performance-efficient design of hybrid cache architectures through adaptive line replacement , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[3]  Tajana Simunic,et al.  MASC: Ultra-low energy multiple-access single-charge TCAM for approximate computing , 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[4]  Tajana Simunic,et al.  CAUSE: Critical application usage-aware memory system using non-volatile memory for mobile devices , 2015, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).

[5]  Cong Xu,et al.  NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory , 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[6]  Xiaoxia Wu,et al.  Power and performance of read-write aware Hybrid Caches with non-volatile memories , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.

[7]  M. Nelhiebel,et al.  The Paradigm Shift in Understanding the Bias Temperature Instability: From Reaction–Diffusion to Switching Oxide Traps , 2011, IEEE Transactions on Electron Devices.

[8]  Mircea R. Stan,et al.  Relaxing non-volatility for fast and energy-efficient STT-RAM caches , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[9]  Mircea R. Stan,et al.  Delivering on the promise of universal memory for spin-transfer torque RAM (STT-RAM) , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[10]  Fredrik Larsson,et al.  Simics: A Full System Simulation Platform , 2002, Computer.

[11]  Margaret Martonosi,et al.  Dynamically exploiting narrow width operands to improve processor power and performance , 1999, Proceedings Fifth International Symposium on High-Performance Computer Architecture.

[12]  Jeffrey S. Vetter,et al.  AYUSH: A Technique for Extending Lifetime of SRAM-NVM Hybrid Caches , 2015, IEEE Computer Architecture Letters.

[13]  Yiran Chen,et al.  Emerging non-volatile memories: Opportunities and challenges , 2011, 2011 Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[14]  Shuai Wang,et al.  Exploiting narrow-width values for improving non-volatile cache lifetime , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[15]  Mohsen Imani,et al.  Analysis of power gating in different hierarchical levels of 2MB cache, considering variation , 2015 .

[16]  Jan M. Rabaey,et al.  SRAM leakage suppression by minimizing standby supply voltage , 2004, International Symposium on Signals, Circuits and Systems. Proceedings, SCS 2003. (Cat. No.03EX720).

[17]  R.V. Joshi,et al.  The Impact of Aging Effects and Manufacturing Variation on SRAM Soft-Error Rate , 2008, IEEE Transactions on Device and Materials Reliability.

[18]  Mehdi Baradaran Tahoori,et al.  Architectural aspects in design and analysis of SOT-based memories , 2014, 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC).

[19]  Chita R. Das,et al.  Cache revive: Architecting volatile STT-RAM caches for enhanced performance in CMPs , 2012, DAC Design Automation Conference 2012.

[20]  Alexander Fish,et al.  A 40-nm Sub-Threshold 5T SRAM Bit Cell With Improved Read and Write Stability , 2012, IEEE Transactions on Circuits and Systems II: Express Briefs.

[21]  Jianhua Li,et al.  STT-RAM based energy-efficiency hybrid cache for CMPs , 2011, 2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip.

[22]  Jun Yang,et al.  Energy reduction for STT-RAM using early write termination , 2009, 2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers.

[23]  Tajana Simunic,et al.  Hierarchical design of robust and low data dependent FinFET based SRAM array , 2015, Proceedings of the 2015 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH´15).

[24]  Milo M. K. Martin,et al.  Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.

[25]  Weng-Fai Wong,et al.  A coherent hybrid SRAM and STT-RAM L1 cache architecture for shared memory multicores , 2014, 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC).

[26]  Yiran Chen,et al.  Processor caches built using multi-level spin-transfer torque RAM cells , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[27]  Wenqing Wu,et al.  Multi retention level STT-RAM cache designs with a dynamic refresh scheme , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[28]  Sooyong Kang,et al.  Performance Trade-Offs in Using NVRAM Write Buffer for Flash Memory-Based Storage Devices , 2009, IEEE Transactions on Computers.