Impact of Multilevel Retention Characteristics on RRAM-Based DNN Inference Engine

In this work, the retention characteristics of a multilevel HfO2 resistive random access memory (RRAM) synaptic array were statistically measured on a 90 nm test chip and modeled at different temperatures. We found that at elevated temperatures, not only does the average conductance drift (especially at the intermediate states), but the variance of the conductance distribution also widens. To investigate the impact of this synaptic weight drift on deep neural network inference, the experimental data were incorporated into ResNet-18 simulations with 1-4-bit weight precision. The results show that inference accuracy drops significantly at 55°C or above, implying that further engineering of RRAM retention, or circuit/algorithmic compensation techniques, is still required.
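The evaluation methodology described above can be sketched as follows. This is a minimal illustration, not the authors' simulator: the helper names, the uniform quantizer, and the Gaussian drift model with hypothetical `drift_mean`/`drift_std` parameters are all assumptions standing in for per-level drift statistics that would be extracted from measured retention data at a given temperature.

```python
import numpy as np

def quantize_weights(w, n_bits):
    """Uniformly quantize weights to 2**n_bits levels in [-1, 1]."""
    levels = 2 ** n_bits - 1
    w_clipped = np.clip(w, -1.0, 1.0)
    return np.round((w_clipped + 1.0) / 2.0 * levels) / levels * 2.0 - 1.0

def apply_retention_drift(w_q, drift_mean, drift_std, rng):
    """Shift each quantized weight by a mean drift and add Gaussian
    variation, mimicking the measured drift of average conductance and
    the widening of its distribution at elevated temperature."""
    return w_q + drift_mean + rng.normal(0.0, drift_std, size=w_q.shape)

rng = np.random.default_rng(0)
w = rng.uniform(-1, 1, size=(64, 64))            # stand-in for a layer's weights
w_q = quantize_weights(w, n_bits=2)              # 2-bit weights -> 4 conductance levels
w_drift = apply_retention_drift(w_q, drift_mean=0.02, drift_std=0.05, rng=rng)
```

In a full experiment, `apply_retention_drift` would be applied to every mapped layer of the network (e.g. ResNet-18) before running inference on the test set, and the accuracy drop would be recorded as a function of the temperature-dependent drift statistics.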
