Design of Reliable DNN Accelerator with Unreliable ReRAM

This paper presents an algorithmic approach to designing a reliable ReRAM-based Processing-in-Memory (PIM) architecture for Deep Neural Network (DNN) acceleration under the intrinsic stochastic behavior of ReRAM devices. We employ the dynamic fixed point (DFP) data representation format, which adaptively shifts the radix point based on the data range, minimizing unused most significant bits (MSBs). Further, we propose a device-variability-aware (DVA) training methodology in which stochastic noise is added to the parameters during training to enhance the network's robustness to parameter variation. Simulations indicate that, on average, the proposed algorithms improve computing accuracy by more than 20% across various benchmark DNNs (convolutional and recurrent). Moreover, the proposed approach improves the robustness of the DNN to noisy input data.
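
To make the DFP idea concrete, the sketch below picks the radix point per tensor from its dynamic range, so that no stored bit is wasted on always-zero MSBs. This is a minimal NumPy illustration under stated assumptions: the function name `dfp_quantize`, the 8-bit default word length, and the sign-plus-magnitude bit budget are illustrative choices, not details taken from the paper.

```python
import numpy as np

def dfp_quantize(x, total_bits=8):
    """Quantize a tensor to dynamic fixed point: choose the fractional
    bit width from the data range, then round and saturate.
    Returns the dequantized tensor and the chosen fractional bit count."""
    x = np.asarray(x, dtype=np.float64)
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy(), total_bits - 1
    # Integer bits needed to cover the data range; this may be negative
    # for purely fractional data, which shifts the radix point further left.
    int_bits = int(np.floor(np.log2(max_abs))) + 1
    frac_bits = total_bits - 1 - int_bits  # one bit reserved for sign
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale, frac_bits
```

For example, a weight tensor whose values lie in [-0.6, 0.6] gets 7 fractional bits out of an 8-bit word, whereas a tensor spanning [-4, 4] gets only 4, trading resolution for range per layer rather than globally.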

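The DVA training methodology can likewise be sketched as noise injection on the weights during the forward pass, so the optimizer learns parameters that tolerate device variation. The sketch below assumes a PyTorch implementation with multiplicative Gaussian noise; the layer name `NoisyLinear` and the relative-variation hyperparameter `sigma` are hypothetical, and the paper's actual ReRAM noise model may differ.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights with multiplicative Gaussian
    noise during training, emulating ReRAM conductance variation.
    `sigma` is an assumed hyperparameter, not a value from the paper."""

    def __init__(self, in_features, out_features, sigma=0.1, **kwargs):
        super().__init__(in_features, out_features, **kwargs)
        self.sigma = sigma

    def forward(self, x):
        if self.training and self.sigma > 0:
            # Draw fresh noise on every forward pass; gradients flow through
            # the perturbed weights, so training spreads the loss surface
            # over the device-variation distribution.
            noise = 1.0 + self.sigma * torch.randn_like(self.weight)
            return nn.functional.linear(x, self.weight * noise, self.bias)
        return super().forward(x)  # noise-free at inference/evaluation time
```

At deployment the layer behaves as a standard `nn.Linear`; the noise exists only to regularize training toward flat minima that remain accurate when the ReRAM cells drift from their programmed conductances.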