Paper Information - An Overview of In-memory Processing with Emerging Non-volatile Memory for Data-intensive Applications

The conventional von Neumann architecture has proven to be a major performance and energy bottleneck for emerging data-intensive applications. The decades-old idea of using in-memory processing to eliminate substantial data movement has returned and led to extensive research activity. The effectiveness of in-memory processing relies heavily on memory scalability, which traditional memory technologies cannot provide. Emerging non-volatile memories (eNVMs), by contrast, possess appealing qualities such as excellent scaling and low energy consumption, and have been heavily investigated as a basis for in-memory processing architectures. In this paper, we summarize recent research progress in eNVM-based in-memory processing from several aspects: the memory technologies adopted, the location of the in-memory processing within the system, the arithmetic operations supported, and the target applications.
