Exploring Hybrid Memory Caches in Chip Multiprocessors

Studies have shown that memory needs vary significantly across applications. Recent work has explored using hybrid memory technology (SRAM+NVM) in the on-chip memories of chip multiprocessors (CMPs) to support the varied needs of diverse workloads. Such works suggest architectural modifications that require supplemental management in the memory hierarchy. Instead, we propose to deploy hybrid memory in a manner that integrates seamlessly with the existing heterogeneous multicore (HMP) architectural model and therefore requires no supplemental management, only the integration of different memory technologies on-chip. We evaluate platforms that combine fast (SRAM cache) and slow (STT-MRAM cache) core types for mobile workloads.
