TA-LRW: A Replacement Policy for Error Rate Reduction in STT-MRAM Caches

As the technology process node scales down, on-chip SRAM caches lose their efficiency because of their poor scalability, high leakage power, and increasing soft-error rate. Among emerging memory technologies, Spin-Transfer Torque Magnetic RAM (STT-MRAM) is known as the most promising replacement for SRAM-based cache memories. The main advantages of STT-MRAM are its non-volatility, near-zero leakage power, higher density, soft-error immunity, and higher scalability. Despite these advantages, the high error rate in STT-MRAM cells due to retention failure, write failure, and read disturbance threatens the reliability of cache memories built upon STT-MRAM technology. The error rate increases significantly at higher temperatures, which further degrades the reliability of STT-MRAM-based cache memories.
The major source of heat generation and temperature increase in STT-MRAM cache memories is write operations, which are managed by the cache replacement policy. To the best of our knowledge, no previous study has attempted to mitigate heat generation and high temperature in STT-MRAM cache memories through the replacement policy. In this paper, we first analyze cache behavior under the conventional Least-Recently Used (LRU) replacement policy and demonstrate that the majority of consecutive write operations (more than 66 percent) are committed to adjacent cache blocks. These adjacent write operations cause heat accumulation and increased temperature, which significantly increase the cache error rate.
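The adjacency observation can be made concrete with a small metric. The following is a minimal sketch, not the authors' measurement tooling: it assumes a write trace is available as a list of way indices within a single cache set (a hypothetical input format), and it treats two writes as adjacent when their way indices differ by at most one, which is an assumption about the physical layout of the ways. The function name `adjacent_write_fraction` and the sample trace are illustrative only.

```python
def adjacent_write_fraction(ways, max_distance=1):
    """Fraction of consecutive write pairs whose way indices are at most
    max_distance apart (a distance of 0 means the same block was rewritten)."""
    pairs = list(zip(ways, ways[1:]))
    if not pairs:
        # Fewer than two writes: no consecutive pair exists.
        return 0.0
    close = sum(1 for a, b in pairs if abs(a - b) <= max_distance)
    return close / len(pairs)


# Hypothetical sequence of written ways in one 8-way set (illustrative, not measured data).
trace = [0, 1, 2, 5, 6, 7, 3, 4]
print(adjacent_write_fraction(trace))
```

A value above 0.66 on a real trace would correspond to the "more than 66 percent" figure reported for LRU, under the adjacency definition assumed here.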
To eliminate heat accumulation and the adjacency of consecutive writes, we propose a cache replacement policy, named Thermal-Aware Least-Recently Written (TA-LRW), which smoothly distributes the generated heat by directing consecutive write operations to distant cache blocks. TA-LRW guarantees a distance of at least three blocks between any two consecutive write operations in an 8-way associative cache. This distant write scheme reduces the temperature-induced error rate by 94.8 percent, on average, compared with the conventional LRU policy, which results in a 6.9x reduction in cache error rate. The implementation cost and complexity of TA-LRW are as low as those of the First-In, First-Out (FIFO) policy while providing near-LRU performance, combining the advantages of both replacement policies. The significantly reduced error rate is achieved at the cost of only a 2.3 percent performance overhead compared with the LRU policy.
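The abstract does not spell out how TA-LRW selects a victim, so the following is only a sketch of one way a FIFO-like pointer could deliver the stated guarantee, not the authors' algorithm. It assumes a per-set pointer that advances by a fixed stride of 3 on every replacement-driven write; because 3 is coprime with 8, the pointer visits every way before repeating. The class `StridedWritePointer` and the helper `min_consecutive_distance` are hypothetical names, and distance is assumed to be the absolute difference of way indices.

```python
from math import gcd


class StridedWritePointer:
    """Per-set victim pointer that advances by a fixed stride on each fill (hypothetical)."""

    def __init__(self, ways=8, stride=3):
        # The stride must be coprime with the way count so every way is visited.
        if gcd(ways, stride) != 1:
            raise ValueError("stride must be coprime with the number of ways")
        self.ways = ways
        self.stride = stride
        self.next_way = 0

    def pick_victim(self):
        """Return the way to write next and advance the pointer."""
        victim = self.next_way
        self.next_way = (self.next_way + self.stride) % self.ways
        return victim


def min_consecutive_distance(order):
    """Smallest index distance between consecutive entries, including the wrap-around pair."""
    pairs = zip(order, order[1:] + order[:1])
    return min(abs(a - b) for a, b in pairs)


pointer = StridedWritePointer()
order = [pointer.pick_victim() for _ in range(8)]
print(order)                            # [0, 3, 6, 1, 4, 7, 2, 5]
print(min_consecutive_distance(order))  # 3
```

Like FIFO, this sketch needs only one small counter per set, which is consistent with the low implementation cost claimed above. It covers only writes caused by replacements; writes that hit an already resident block go wherever that block lives and are outside what this pointer controls.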
