Accurate Inference With Inaccurate RRAM Devices: A Joint Algorithm-Design Solution

Resistive random access memory (RRAM) is a promising technology for energy-efficient neuromorphic accelerators. However, when a pretrained deep neural network (DNN) model is programmed to an RRAM array for inference, the model suffers from accuracy degradation due to RRAM nonidealities, such as device variations, quantization error, and stuck-at faults. Previous solutions based on repeated read–verify–write (R-V-W) operations on the RRAM cells require cell-by-cell compensation and, thus, an excessive amount of processing time. In this article, we propose a joint algorithm-design solution to mitigate the accuracy degradation. We first leverage knowledge distillation (KD), where the model is trained with the RRAM nonidealities to increase its robustness under device variations. Furthermore, we propose random sparse adaptation (RSA), which integrates a small on-chip memory with the main RRAM array for postmapping adaptation. Only the on-chip memory is updated to recover the inference accuracy. The joint algorithm-design solution achieves state-of-the-art accuracy of 99.41% for MNIST (LeNet-5) and 91.86% for CIFAR-10 (VGG-16) with at most 5% additional parameters as overhead, while providing a 15–150× speedup compared with R-V-W.
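As a rough illustration of the postmapping adaptation idea, the toy sketch below programs a single linear layer onto a simulated RRAM array and then updates only a random ~5% subset of weights held in a separate memory. It is not the authors' implementation: the nonideality model (uniform quantization, log-normal variation, cells stuck at zero), the least-squares objective against the ideal layer's outputs, and all names (`program_to_rram`, `rsa_adapt`, `effective_weights`, `mse`) and parameter values are assumptions chosen for brevity.

```python
import numpy as np

def program_to_rram(w, rng, sigma=0.1, half_levels=8, fault_rate=0.01):
    """Hypothetical nonideality model (an assumption, not from the paper)."""
    step = np.abs(w).max() / half_levels
    q = np.round(w / step) * step                      # uniform quantization
    noisy = q * np.exp(rng.normal(0.0, sigma, size=w.shape))  # device variation
    stuck = rng.random(w.shape) < fault_rate
    noisy[stuck] = 0.0                                 # simplified stuck-at fault
    return noisy

def effective_weights(w_rram, mask, w_mem):
    """Weights seen at inference: on-chip memory overrides the masked RRAM cells."""
    w_eff = w_rram.copy()
    w_eff[mask] = w_mem
    return w_eff

def rsa_adapt(w_rram, x, y_ref, rng, frac=0.05, lr=0.1, steps=200):
    """Update only a random sparse subset of weights; the RRAM array stays fixed."""
    mask = rng.random(w_rram.shape) < frac             # random sparse selection
    w_mem = w_rram[mask].copy()                        # on-chip memory contents
    for _ in range(steps):
        err = x @ effective_weights(w_rram, mask, w_mem) - y_ref
        grad = x.T @ err / len(x)                      # grad of 0.5 * per-sample squared error
        w_mem -= lr * grad[mask]                       # only masked entries are updated
    return mask, w_mem

def mse(w, x, y_ref):
    return float(np.mean((x @ w - y_ref) ** 2))

rng = np.random.default_rng(0)
w_ideal = rng.normal(0.0, 0.5, size=(64, 10))          # stand-in for a pretrained layer
x = rng.normal(size=(512, 64))                         # stand-in calibration inputs
y_ref = x @ w_ideal                                    # outputs of the ideal layer

w_rram = program_to_rram(w_ideal, rng)
mask, w_mem = rsa_adapt(w_rram, x, y_ref, rng)

before = mse(w_rram, x, y_ref)
after = mse(effective_weights(w_rram, mask, w_mem), x, y_ref)
print(f"adapted fraction: {mask.mean():.3f}, error before: {before:.4f}, after: {after:.4f}")
```

In this sketch the programmed array is never rewritten, which mirrors why the approach avoids cell-by-cell R-V-W. How much error a 5% subset can recover on a real DNN is an empirical question that this toy example does not answer.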
