Periodic learning-based region selection for energy-efficient MLC STT-RAM cache

Multi-level cell (MLC) spin-transfer torque RAM (STT-RAM) is emerging as one of the most promising candidates to replace SRAM in on-chip last-level caches. Compared with a single-level cell (SLC) STT-RAM design, an MLC cache offers higher storage capacity. However, due to cell design constraints, MLC STT-RAM suffers from considerably long write latency and high write energy. To explore the potential benefits of MLC STT-RAM caches, this paper proposes a scheme named periodic learning-based region selection (PLRS). We first formulate the region selection problem using a greedy algorithm, then profile and collect cache access behavior through periodic learning. Finally, PLRS determines the region selection based on this behavior information. Experimental results show that, compared to a conventional MLC STT-RAM cache, PLRS reduces dynamic energy consumption by 22.7% and execution time by 16.2% on average, with negligible overhead.
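The abstract does not give the details of PLRS, so the following is only a minimal sketch of the general idea of periodic profiling followed by a greedy choice. It assumes, purely for illustration, that a limited number of blocks can be placed in a cheaper-to-write region and that write counts gathered over a fixed-length period are the only behavior information used. The names `greedy_region_selection`, `run_periodic_learning`, `period`, and `fast_capacity` are hypothetical and do not come from the paper.

```python
from collections import Counter


def greedy_region_selection(write_counts, fast_capacity):
    """Greedily pick up to `fast_capacity` block ids with the most writes.

    Assumption for this sketch: write count is the only profiled metric.
    Ties are broken by the smaller block id so the result is deterministic.
    """
    ranked = sorted(write_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {block for block, count in ranked[:fast_capacity] if count > 0}


def run_periodic_learning(trace, period, fast_capacity):
    """Profile a trace of (block_id, is_write) accesses in fixed periods.

    At the end of each period the profiled counts drive a greedy selection,
    then the counters are reset for the next learning period.
    Returns the list of selections, one per completed period.
    """
    counts = Counter()
    selections = []
    for i, (block, is_write) in enumerate(trace, start=1):
        if is_write:
            counts[block] += 1
        if i % period == 0:
            selections.append(greedy_region_selection(counts, fast_capacity))
            counts.clear()
    return selections


if __name__ == "__main__":
    # Hypothetical access trace: (block id, is this access a write?)
    trace = [
        (1, True), (2, True), (1, True), (3, False),
        (2, True), (2, True), (3, True), (3, True),
    ]
    print(run_periodic_learning(trace, period=4, fast_capacity=1))
```

In this toy trace, block 1 receives the most writes in the first period and is selected. In the second period, blocks 2 and 3 tie, and the smaller id is chosen. A real scheme would weigh actual latency and energy costs rather than raw write counts.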
