Accurate Inference with Inaccurate RRAM Devices: Statistical Data, Model Transfer, and On-line Adaptation

Resistive random-access memory (RRAM) is a promising technology for in-memory computing, offering high storage density, fast inference, and good compatibility with CMOS. However, mapping a pre-trained deep neural network (DNN) model onto RRAM suffers from realistic device issues, especially variation and quantization error, resulting in a significant loss of inference accuracy. In this work, we first extract these statistical properties from 65 nm RRAM data measured on 300 mm wafers. The RRAM data exhibit 10 quantization levels and 50% variance, causing accuracy to drop to 31.76% and 10.49% on the MNIST and CIFAR-10 datasets, respectively. Based on the experimental data, we propose a combination of machine learning algorithms and on-line adaptation to recover the accuracy with minimal overhead. The recipe first applies knowledge distillation (KD) to transfer an ideal teacher model into a student model that incorporates the statistical variations and 10 quantization levels. An on-line sparse adaptation (OSA) method is then applied to the DNN model mapped onto the RRAM array: guided by importance sampling, OSA adds a small SRAM array that is sparsely connected to the main RRAM array, and only this SRAM array is updated to recover the accuracy. As demonstrated on the MNIST and CIFAR-10 datasets, a 7.86% area cost is sufficient to recover baseline accuracy on the 65 nm RRAM devices.
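To make the recipe concrete, the sketch below illustrates in PyTorch how the three ingredients described above could fit together: weights quantized to 10 conductance levels with ~50% multiplicative device variation, a standard KD loss, and an OSA-style layer in which frozen RRAM weights are corrected by a small, sparsely connected set of trainable SRAM cells. This is a minimal sketch under stated assumptions, not the authors' implementation: the names (`rram_map`, `kd_loss`, `OSALinear`, `sram_fraction`), the Gaussian noise model, and the magnitude-based cell selection standing in for the paper's importance sampling are all illustrative.

```python
# Illustrative sketch (assumed names; not the authors' code): simulating the
# RRAM non-idealities described in the abstract and the KD + OSA recovery recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

def rram_map(w, levels=10, sigma=0.5):
    """Quantize weights to `levels` states, then apply multiplicative
    Gaussian variation (sigma=0.5 stands in for the reported ~50%)."""
    step = 2 * w.abs().max() / (levels - 1)
    w_q = torch.round(w / step) * step                 # 10-level quantization
    return w_q * (1 + sigma * torch.randn_like(w_q))   # device-to-device variation

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style knowledge distillation: soft targets from the ideal
    teacher plus the usual hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

class OSALinear(nn.Module):
    """Frozen RRAM weights plus a sparse, trainable SRAM correction.
    Magnitude-based cell selection is an assumed stand-in for the
    paper's importance sampling."""
    def __init__(self, linear, sram_fraction=0.08):
        super().__init__()
        self.register_buffer("w_rram", rram_map(linear.weight.data))
        self.register_buffer(
            "bias", linear.bias.data.clone() if linear.bias is not None else None)
        # Select a small fraction of cells to back with SRAM.
        k = max(1, int(sram_fraction * linear.weight.numel()))
        idx = linear.weight.abs().flatten().topk(k).indices
        mask = torch.zeros(linear.weight.numel())
        mask[idx] = 1.0
        self.register_buffer("mask", mask.view_as(linear.weight))
        # The only trainable tensor: updates touch SRAM, never the RRAM array.
        self.w_sram = nn.Parameter(torch.zeros_like(linear.weight))

    def forward(self, x):
        # RRAM carries the bulk of the MAC; SRAM patches the selected cells.
        return F.linear(x, self.w_rram + self.mask * self.w_sram, self.bias)
```

With this structure, on-line adaptation reduces to a standard training loop: the optimizer only ever sees `w_sram`, since the RRAM weights and mask are registered as buffers rather than parameters. The `sram_fraction=0.08` default loosely mirrors the 7.86% area cost quoted in the abstract, though the actual figure would depend on the relative SRAM and RRAM cell sizes.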
