SWIM: Selective Write-Verify for Computing-in-Memory Neural Accelerators

Computing-in-Memory architectures based on non-volatile emerging memories have demonstrated great potential for deep neural network (DNN) acceleration thanks to their high energy efficiency. However, these emerging devices can suffer from significant variations during the mapping process (i.e., programming weights to the devices), which, if left unaddressed, can cause significant accuracy degradation. The non-ideality of weight mapping can be compensated by iterative programming with a write-verify scheme, i.e., reading the conductance and rewriting if necessary. In all existing works, such a practice is applied to every single weight of a DNN as it is being mapped, which requires extensive programming time. In this work, we show that it is only necessary to select a small portion of the weights for write-verify to maintain DNN accuracy, thus achieving significant speedup. We further introduce SWIM, a second-derivative-based technique that requires only a single pass of forward and backpropagation to efficiently select the weights that need write-verify. Experimental results on various DNN architectures and datasets show that SWIM can achieve up to 10x programming speedup compared with conventional full-blown write-verify while attaining comparable accuracy.
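
To make the selection idea concrete, the sketch below ranks weights by an Optimal-Brain-Damage-style second-derivative saliency, approximated here with squared first-order gradients obtained from a single forward and backward pass, and flags only the most sensitive fraction for write-verify. This is a minimal illustration under stated assumptions: the scoring formula, the `fraction` parameter, and the helper names are illustrative and are not claimed to be the exact criterion or procedure used by SWIM.

```python
# Minimal sketch (not the authors' code): estimate per-weight sensitivity
# with an OBD-style second-order saliency and select only the top fraction
# of weights for iterative write-verify. The empirical-Fisher diagonal
# (squared gradients) stands in for the Hessian diagonal so that a single
# forward/backward pass suffices; SWIM's exact criterion may differ.
import torch
import torch.nn as nn


def sensitivity_scores(model: nn.Module, inputs, targets, criterion):
    """One forward/backward pass; score each weight by grad^2 * weight^2,
    a cheap estimate of the loss increase caused by perturbing it."""
    model.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    scores = {}
    for name, p in model.named_parameters():
        if p.grad is not None:
            scores[name] = (p.grad.detach() ** 2) * (p.detach() ** 2)
    return scores


def select_for_write_verify(scores, fraction=0.1):
    """Return a boolean mask per tensor marking the top-`fraction` most
    sensitive weights, i.e., the only ones sent through write-verify."""
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(fraction * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {name: s >= threshold for name, s in scores.items()}


if __name__ == "__main__":
    # Toy model and data purely for demonstration.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
    scores = sensitivity_scores(model, x, y, nn.CrossEntropyLoss())
    masks = select_for_write_verify(scores, fraction=0.1)
    print({name: int(mask.sum()) for name, mask in masks.items()})
```

In this sketch, all remaining weights would be programmed once without verification, which is what yields the programming-time savings the abstract describes.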
