Accurate Inference with Inaccurate RRAM Devices: Statistical Data, Model Transfer, and On-line Adaptation

Resistive random-access memory (RRAM) is a promising technology for in-memory computing, offering high storage density, fast inference, and good compatibility with CMOS. However, mapping a pre-trained deep neural network (DNN) model onto RRAM suffers from realistic device issues, especially variation and quantization error, resulting in a significant loss of inference accuracy. In this work, we first extract these statistical properties from 65 nm RRAM data measured on 300 mm wafers. The RRAM data exhibit 10 quantization levels and 50% variance, causing accuracy to drop to 31.76% and 10.49% on the MNIST and CIFAR-10 datasets, respectively. Based on the experimental data, we propose a combination of machine learning algorithms and on-line adaptation to recover the accuracy with minimal overhead. The recipe first applies knowledge distillation (KD) to transfer an ideal teacher model into a student model that incorporates the statistical variations and 10 quantization levels. An on-line sparse adaptation (OSA) method is then applied to the DNN model mapped onto the RRAM array: guided by importance sampling, OSA adds a small SRAM array that is sparsely connected to the main RRAM array, and only this SRAM array is updated to recover the accuracy. As demonstrated on the MNIST and CIFAR-10 datasets, a 7.86% area cost is sufficient to recover baseline accuracy on the 65 nm RRAM devices.
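To make the recipe concrete, the sketch below illustrates in PyTorch how the three ingredients described above could fit together: weights quantized to 10 conductance levels with ~50% multiplicative device variation, a standard KD loss, and an OSA-style layer in which frozen RRAM weights are corrected by a small, sparsely connected set of trainable SRAM cells. This is a minimal sketch under stated assumptions, not the authors' implementation: the names (`rram_map`, `kd_loss`, `OSALinear`, `sram_fraction`), the Gaussian noise model, and the magnitude-based cell selection standing in for the paper's importance sampling are all illustrative.

```python
# Illustrative sketch (assumed names; not the authors' code): simulating the
# RRAM non-idealities described in the abstract and the KD + OSA recovery recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

def rram_map(w, levels=10, sigma=0.5):
    """Quantize weights to `levels` states, then apply multiplicative
    Gaussian variation (sigma=0.5 stands in for the reported ~50%)."""
    step = 2 * w.abs().max() / (levels - 1)
    w_q = torch.round(w / step) * step                 # 10-level quantization
    return w_q * (1 + sigma * torch.randn_like(w_q))   # device-to-device variation

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style knowledge distillation: soft targets from the ideal
    teacher plus the usual hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

class OSALinear(nn.Module):
    """Frozen RRAM weights plus a sparse, trainable SRAM correction.
    Magnitude-based cell selection is an assumed stand-in for the
    paper's importance sampling."""
    def __init__(self, linear, sram_fraction=0.08):
        super().__init__()
        self.register_buffer("w_rram", rram_map(linear.weight.data))
        self.register_buffer(
            "bias", linear.bias.data.clone() if linear.bias is not None else None)
        # Select a small fraction of cells to back with SRAM.
        k = max(1, int(sram_fraction * linear.weight.numel()))
        idx = linear.weight.abs().flatten().topk(k).indices
        mask = torch.zeros(linear.weight.numel())
        mask[idx] = 1.0
        self.register_buffer("mask", mask.view_as(linear.weight))
        # The only trainable tensor: updates touch SRAM, never the RRAM array.
        self.w_sram = nn.Parameter(torch.zeros_like(linear.weight))

    def forward(self, x):
        # RRAM carries the bulk of the MAC; SRAM patches the selected cells.
        return F.linear(x, self.w_rram + self.mask * self.w_sram, self.bias)
```

With this structure, on-line adaptation reduces to a standard training loop: the optimizer only ever sees `w_sram`, since the RRAM weights and mask are registered as buffers rather than parameters. The `sram_fraction=0.08` default loosely mirrors the 7.86% area cost quoted in the abstract, though the actual figure would depend on the relative SRAM and RRAM cell sizes.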
