Domain Wall Memory-Layout, Circuit and Synergistic Systems

Domain wall memory (DWM) is gaining significant attention for embedded cache applications due to its low standby power, excellent retention, and ability to store multiple bits per cell. It also offers fast access time and good endurance. However, it suffers from high write latency, shift latency, shift power, and write power. DWM is sequential in nature, so the latency of a read/write operation depends on the offset of the bit from the read/write head. This paper investigates circuit design challenges such as bitcell layout, head positioning, nanowire utilization factor, shift power, and shift latency, and provides solutions to these issues. A synergistic system is proposed that combines circuit techniques, namely merged read/write heads (for compact layout), flipped bitcell and shift gating (for shift power optimization), wordline strapping (for access latency), and shift circuit design, with two micro-architectural techniques: 1) segmented cache and 2) workload-aware dynamic shift and write current boosting, to realize an energy-efficient and robust DWM cache. Simulations show 3-33% performance and 1.2-14.4X power consumption improvement for cache segregation, and 2.5-31% performance and 1.3-14.9X power improvement for dynamic current boosting, over a wide range of PARSEC benchmarks.
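The dependence of access latency on the bit's offset from the read/write head can be illustrated with a minimal sketch. This is not the paper's model. The function names, the one-cycle-per-shift cost, and the assumption that the nanowire returns to its rest position after every access are all hypothetical, chosen only to show why head positioning matters.

```python
# Toy model of shift count in a DWM nanowire (illustrative assumptions only).
# Assumption: the nanowire is at its rest position before each access, so the
# number of shifts equals the distance from the bit to the nearest head.

def shifts_needed(bit_index, head_positions):
    """Single-bit shifts required to align bit_index with the nearest head."""
    return min(abs(bit_index - h) for h in head_positions)


def average_shifts(num_bits, head_positions):
    """Mean shift count over all bits, assuming uniformly random accesses."""
    total = sum(shifts_needed(b, head_positions) for b in range(num_bits))
    return total / num_bits


def access_latency(bit_index, head_positions,
                   shift_cycles_per_bit=1, access_cycles=1):
    """Latency in cycles: shift time plus a fixed read/write time.

    Both cycle costs are hypothetical placeholders, not measured values.
    """
    shifts = shifts_needed(bit_index, head_positions)
    return shifts * shift_cycles_per_bit + access_cycles


if __name__ == "__main__":
    # An 8-bit nanowire with one head at the end versus two spread-out heads.
    print(average_shifts(8, [0]))
    print(average_shifts(8, [2, 6]))
    print(access_latency(7, [0]))
```

In this toy setting, adding heads or placing them closer to frequently accessed bits lowers the average shift count, which is the intuition behind treating head positioning as a design parameter.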
