Power and energy reduction of racetrack-based caches by exploiting shared shift operations

In this paper, we propose a technique for reducing the power and energy consumption of racetrack-based caches. The technique maps logical cache lines to the physical domains of the nanowires. The mapping exploits the fact that, in a nanowire with several access heads, each shift operation is shared by all the heads on that nanowire. By utilizing this inherent sharing, fewer nanowires need to be shifted to make a cache line available for either a read or a write access. With this method, the cache sets are shifted separately, which increases the average number of shift operations. Nevertheless, because the shift operations are shared among multiple heads, the total power and energy consumption of the shift operations is reduced. The effectiveness of the proposed technique is studied using the PARSEC benchmark suite. The study shows that the power, energy consumption, energy-delay product, and energy-delay-squared product of L2 caches are reduced, on average, by 53%, 44%, 32%, and 17%, respectively, compared to the state-of-the-art mapping methods.
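The trade-off described above (more shift steps on average, but fewer nanowires shifted per access) can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the mapping or the energy model used in the paper: the function names, the 512-bit line, the 4 heads per nanowire, the step counts, and the unit energy per shift are all hypothetical values chosen for illustration.

```python
from math import ceil


def wires_shifted_one_bit_per_wire(line_bits: int) -> int:
    # Assumed baseline mapping: every bit of a cache line sits on its own
    # nanowire, so one access shifts as many nanowires as the line has bits.
    return line_bits


def wires_shifted_shared_heads(line_bits: int, heads_per_wire: int) -> int:
    # Assumed shared-head mapping: bits of the same line are placed at the
    # same offset under each head of a nanowire, so a single shift aligns
    # heads_per_wire bits at once and fewer nanowires must be shifted.
    return ceil(line_bits / heads_per_wire)


def shift_energy(wires: int, avg_steps: float, energy_per_wire_step: float = 1.0) -> float:
    # Toy energy model: energy grows linearly with the number of nanowires
    # shifted and with the average number of shift steps per access.
    return wires * avg_steps * energy_per_wire_step


# Hypothetical parameters, not taken from the paper.
LINE_BITS = 512
HEADS_PER_WIRE = 4

baseline = shift_energy(wires_shifted_one_bit_per_wire(LINE_BITS), avg_steps=2.0)
shared = shift_energy(wires_shifted_shared_heads(LINE_BITS, HEADS_PER_WIRE), avg_steps=3.0)

print(f"baseline shift energy:    {baseline}")
print(f"shared-head shift energy: {shared}")
```

In this sketch the shared-head mapping takes more steps per access (3.0 instead of 2.0, both assumed), yet its total shift energy is lower because only a quarter of the nanowires are shifted.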
