Exploring the vulnerability of CMPs to soft errors with 3D stacked non-volatile memory

Spin-transfer Torque Random Access Memory (STT-RAM) emerges for on-chip memory in microprocessor architectures. Thanks to the magnetic field based storage STT-RAM cells have immunity to radiation induced soft errors that affect electrical charge based data storage, which is a major challenge in SRAM based caches in current microprocessors. In this study we explore the soft error resilience benefits and design trade offs of 3D-stacked STT-RAM for multi-core architectures. We use 3D stacking as an enabler for modular integration of STT-RAM caches with minimum disruption in the baseline processor design flow, while providing further interconnectivity and capacity advantages. We take an in-depth look at alternative replacement schemes in terms of performance, power, temperature, and reliability trade-offs to capture the multi-variable optimization challenges microprocessor architectures face. We analyze and compare the characteristics of STT-RAM, SRAM, and DRAM alternatives for various levels of the cache hierarchy in terms of reliability.

[1]  H. Peter Hofstee,et al.  Introduction to the Cell multiprocessor , 2005, IBM J. Res. Dev..

[2]  Saied Tehrani Status and prospect for MRAM technology , 2010, 2010 IEEE Hot Chips 22 Symposium (HCS).

[3]  Xiaoxia Wu,et al.  Hybrid cache architecture with disparate memory technologies , 2009, ISCA '09.

[4]  Engin Ipek,et al.  Resistive computation: avoiding the power wall with low-leakage, STT-MRAM based computing , 2010, ISCA.

[5]  William J. Gallagher,et al.  Development of the magnetic tunnel junction MRAM at IBM: From first junctions to a 16-Mb MRAM demonstrator chip , 2006, IBM J. Res. Dev..

[6]  Jung Ho Ahn,et al.  McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[7]  Joel Emer,et al.  A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..

[8]  Jun Yang,et al.  A durable and energy efficient main memory using phase change memory technology , 2009, ISCA '09.

[9]  Onur Mutlu,et al.  Architecting phase change memory as a scalable dram alternative , 2009, ISCA '09.

[10]  Narayanan Vijaykrishnan,et al.  Three-dimensional cache design exploration using 3DCacti , 2005, 2005 International Conference on Computer Design.

[11]  Gu-Yeon Wei,et al.  Process Variation Tolerant 3T1D-Based Cache Architectures , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[12]  Doe Hyun Yoon,et al.  Memory mapped ECC: low-cost error protection for last level caches , 2009, ISCA '09.

[13]  Nanning Zheng,et al.  Using Magnetic RAM to Build Low-Power and Soft Error-Resilient L1 Cache , 2012, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[14]  Kai Li,et al.  The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[15]  D. Dimitrov,et al.  Thermal fluctuation effects on spin torque induced switching: Mean and variations , 2008 .

[16]  Z. Diao,et al.  Spin-transfer torque switching in magnetic tunnel junctions and spin-transfer torque random access memory , 2007 .

[17]  Milo M. K. Martin,et al.  Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.

[18]  Philip G. Emma,et al.  Is 3D chip technology the next growth engine for performance improvement? , 2008, IBM J. Res. Dev..

[19]  Yiran Chen,et al.  A novel architecture of the 3D stacked MRAM L2 cache for CMPs , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[20]  Norman P. Jouppi,et al.  Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).

[21]  Kaustav Banerjee,et al.  A thermally-aware performance analysis of vertically integrated (3-D) processor-memory hierarchy , 2006, 2006 43rd ACM/IEEE Design Automation Conference.

[22]  Pradip Bose,et al.  Phaser: Phased methodology for modeling the system-level effects of soft errors , 2008, IBM J. Res. Dev..

[23]  Yuan Xie,et al.  Design space exploration for 3D architectures , 2006, JETC.

[24]  E. Belhaire,et al.  Macro-model of Spin-Transfer Torque based Magnetic Tunnel Junction device for hybrid Magnetic-CMOS design , 2006, 2006 IEEE International Behavioral Modeling and Simulation Workshop.

[25]  Soontae Kim Reducing Area Overhead for Error-Protecting Large L2/L3 Caches , 2009, IEEE Trans. Computers.

[26]  Kunle Olukotun,et al.  Niagara: a 32-way multithreaded Sparc processor , 2005, IEEE Micro.

[27]  Shubu Mukherjee,et al.  Architecture Design for Soft Errors , 2008 .