Characterizing the impact of process variation on write endurance enhancing techniques for non-volatile memory systems

Much attention has recently been given to a set of promising non-volatile memory technologies, such as PCM, STT-MRAM, and ReRAM. These technologies, however, have limited write endurance relative to DRAM. Potential solutions to this endurance challenge exist in the form of fine-grain wear leveling techniques and aggressive error tolerance approaches. While the existing approaches to wear leveling and error tolerance are sound and demonstrate true potential, their studies have been limited in that (i) they have not considered the interactions between wear leveling and error tolerance and (ii) they have assumed a simple write endurance failure model in which all cells fail uniformly. In this paper, we perform a thorough study that characterizes these interactions and the effects of more realistic non-uniform endurance models under various workloads, both synthetic and derived from benchmarks. The study shows, for instance, that variability in cell endurance significantly affects wear leveling and error tolerance mechanisms and the values of their tuning parameters. It also shows that these mechanisms interact in subtle ways, sometimes cancelling and sometimes boosting each other's impact on the overall endurance of the device.
