WOM-Code Solutions for Low Latency and High Endurance in Phase Change Memory

This paper describes a write-once-memory-code phase change memory (WOM-code PCM) architecture for next-generation non-volatile memory applications. Specifically, we address the long latency of the write operation in PCM (attributed to PCM SET) by proposing a novel PCM memory architecture that integrates the ⟨2²⟩²/3 WOM-code at the memory organization and memory controller levels. To further improve the write latency of WOM-code PCM, we propose a PCM-refresh approach that uses idle cycles to preemptively set PCM rows to the initial WOM-code state. Finally, to balance write latency improvements against WOM-code PCM overhead, we propose a WOM-code cached PCM (WCPCM) architecture that uses WOM-code PCM as the cache alongside conventional PCM main memory. Since WOM-code techniques inherently impact PCM endurance by increasing the number of bit-writes in comparison to unencoded PCM, we incorporate additional transitions from the ⟨2²⟩²/3 WOM-code transition graph to realize endurance-WOM-code (e-WOM-code) architectures. Transitions between the e-WOM-code states on writes to memory are integrated into an incremental coding for endurance (ICE) approach that exploits redundancies in the conventional WOM-code to reduce the number of bit-writes over unencoded PCM. Simulation results show that the proposed e-WOM-code PCM architecture is able to reduce memory write (read) latency by 19.8 percent (14.7 percent) and the number of bit-writes over unencoded PCM without (with) data-comparison write (DCW), a read-modify-write process that only updates changed cells, by 83.0 percent (22.1 percent) on average across general-purpose (SPEC CPU2006), embedded (MiBench), and high-performance (SPLASH-2) benchmarks.
Further, e-WOM-code PCM with PCM-refresh can reduce memory write (read) latency by 51.5 percent (44.1 percent) and the number of bit-writes over unencoded PCM without DCW by 76.5 percent on average across the benchmarks; there is, however, an increase of 19 percent in the number of bit-writes over unencoded PCM with DCW. Finally, for just 4.7 percent memory overhead, the e-WOM-code cached PCM (e-WCPCM) architecture reduces memory write (read) latency by 47.5 percent (41.6 percent) and the number of bit-writes over unencoded PCM without DCW by 68.1 percent on average across the benchmarks; again, there is an increase, here of 49 percent, in the number of bit-writes over unencoded PCM with DCW.
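
For readers unfamiliar with the notation, a two-write WOM-code over three cells stores one of 2² = 4 values twice before the cells must be returned to their initial state. The sketch below uses the classic Rivest-Shamir code table with cells that can only change from 0 to 1. It is an illustration only: the abstract does not specify the codeword table, the cell polarity used for PCM, or the extra e-WOM-code transitions, so the names `FIRST`, `SECOND`, `decode`, `reachable`, and `write` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the classic two-write Rivest-Shamir WOM-code,
# storing 2 data bits in 3 write-once cells (each cell may only go 0 -> 1).
# The table and polarity are assumptions, not the paper's PCM mapping.

# First-generation codewords, used while the cells are still mostly unwritten.
FIRST = {0b00: (0, 0, 0), 0b01: (1, 0, 0), 0b10: (0, 1, 0), 0b11: (0, 0, 1)}
# Second-generation codewords are the bitwise complements of the first.
SECOND = {data: tuple(1 - bit for bit in word) for data, word in FIRST.items()}
# All eight 3-bit patterns decode to exactly one 2-bit value.
DECODE = {word: data for table in (FIRST, SECOND) for data, word in table.items()}


def decode(cells):
    """Return the 2-bit value currently stored in the 3 cells."""
    return DECODE[tuple(cells)]


def reachable(old, new):
    """True if `new` can be reached from `old` using only 0 -> 1 changes."""
    return all(o <= n for o, n in zip(old, new))


def write(cells, data):
    """Return the new cell state after writing `data`, or raise if impossible."""
    cells = tuple(cells)
    if decode(cells) == data:
        return cells  # value already stored, no cell changes needed
    for table in (FIRST, SECOND):
        target = table[data]
        if reachable(cells, target):
            return target
    # Both generations are used up: the cells need a refresh to the initial state.
    raise ValueError("no legal transition; cells must be refreshed first")


state = write((0, 0, 0), 0b01)   # first write
state = write(state, 0b10)       # second write, still only 0 -> 1 changes
print(state, decode(state))
```

Once both generations are used, a further write of a different value fails in this sketch; that is the situation in which the cells have to be returned to the initial WOM-code state, which the PCM-refresh approach does preemptively during idle cycles.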
