Overcoming the challenges of crossbar resistive memory architectures

The scalability of DRAM faces challenges from increasing power consumption and the difficulty of building high aspect ratio capacitors. Consequently, emerging memory technologies including Phase Change Memory (PCM), Spin-Transfer Torque RAM (STT-RAM), and Resistive RAM (ReRAM) are being actively pursued as replacements for DRAM. Among these candidates, ReRAM has superior characteristics such as high density, low write energy, and high endurance, making it an attractive, cost-efficient alternative to DRAM. In this paper, we present a comprehensive study of ReRAM-based memory systems. ReRAM's high density comes from its unique crossbar architecture, in which some peripheral circuits are laid below multiple layers of ReRAM cells. A crossbar architecture introduces special constraints on operating voltages, write latency, and array size. The access latency of a crossbar is a function of the data patterns involved in a write operation. These constraints, combined with the exponential relationship between ReRAM's write voltage and switching latency, provide opportunities for architectural optimizations. This paper makes several key contributions. First, we study the crossbar architecture and describe trade-offs involving voltage drop, write latency, and data patterns. We then analyze microarchitectural enhancements such as double-sided ground biasing and multiphase reset operations to improve write performance. At the architecture level, a simple compression-based data encoding scheme is proposed to further reduce write latency. As the compressibility of a block varies based on its content, write latency is not uniform across blocks. To mitigate the impact of slow writes on performance, we propose and evaluate a novel scheduling policy that makes write decisions based on the latency and activity of a bank.
The experimental results show that our architecture improves the performance of a system using ReRAM-based main memory by about 44% over a conservative baseline and 14% over an aggressive baseline on average, and has less than 10% performance degradation compared to an ideal DRAM-only system.
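
The interaction between voltage drop, data pattern, and write latency described above can be illustrated with a small numerical sketch. It is not the model used in the paper: the linear voltage-drop term, the function names, and every constant below are assumptions chosen only to show why an exponential voltage-latency relationship makes writes that switch fewer cells (for example, compressed blocks) noticeably faster.

```python
import math

# All constants are illustrative assumptions, not values from the paper.
T_REF = 10e-9        # assumed switching latency (s) with no voltage drop
K = 5.0              # assumed exponential sensitivity of latency to voltage (1/V)
DROP_PER_BIT = 0.05  # assumed extra voltage drop (V) per concurrently switched cell


def voltage_drop(bits_switched: int) -> float:
    """Assumed linear model: each switched cell adds a fixed drop along the line."""
    return DROP_PER_BIT * bits_switched


def switching_latency(bits_switched: int) -> float:
    """Latency rises exponentially as the voltage reaching the cell falls."""
    return T_REF * math.exp(K * voltage_drop(bits_switched))


if __name__ == "__main__":
    # Compare a block that switches 8 cells with one that switches only 4,
    # e.g. because a compression-based encoding shortened the data written.
    for bits in (8, 4):
        print(f"{bits} cells switched: {switching_latency(bits) * 1e9:.1f} ns")
```

Under these assumed numbers, halving the number of switched cells cuts the latency by a factor of e, which is the kind of non-uniform, content-dependent write latency that a latency-aware scheduling policy has to account for.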